- Published on
Unconventional use of Newton Method in CurveFinance AMM
1. The Necessity and Importance of Newton's Method in Solidity & Vyper
Just as the Internet evolved through Web 1.0 and Web 2.0, with the user-facing application layer ultimately capturing the most value, the blockchain world is not lacking in increasingly homogeneous infrastructure, but rather in application layers with diverse functionalities. If Web 3.0 truly takes off in the future, the value-capturing driving force will shift from the current base layer to the application layer. The authors believe that current EVM performance bottlenecks and computational gas consumption, combined with certain language features of Solidity and Vyper, discourage many developers from implementing various application functions. Among these challenges, the imbalance between numerical computation performance and resource consumption is a particularly troublesome problem that developers currently face. Many "fast" and "efficient" algorithms in Python, C++, and JavaScript must make various compromises when implemented in Solidity and Vyper. To achieve specific functionality, many project teams even need to independently implement algorithms that are difficult to understand, and these implemented algorithms are difficult to become universal tools for other developers to use directly. Understanding these algorithms requires collaborative effort among developers.
Curve is precisely the type of project team mentioned above. It is a typical project led by a single developer with numerous original ideas and innovations in engineering implementation. The importance of Curve in the blockchain world needs no elaboration.
Here we pay tribute to Michael Egorov, the great founder of Curve, and thank him for the various innovations he has brought to the blockchain. In this paper, we would like to present our own views and insights on a specific aspect, with three purposes:
- Elaborate on the Newton algorithm in the Curve project
- Discuss deformation methods and ideas of Newton's algorithm in different business scenarios
- Inspire research into other numerical optimization algorithms
The third point is the most important in our opinion, for the following reasons:
Obviously, Newton's method is not the fastest and most cost-effective universal algorithm for numerical optimization. Just as no DeFi project can guarantee itself to be bug-free, there is no numerical optimization algorithm in the DeFi world that can guarantee to be both the "fastest" and "most cost-effective" in every project. In this paper, we hope that subsequent researchers can continue in-depth research and discussion on the questions we raise, which are as follows (If possible, we hope the Curve project can provide grant support for the following questions):
- If Curve's Newton method is not optimal, which algorithm is the fastest and most "economical" (gas-efficient)?
- Build Solidity function libraries for Newton's method or variants of Newton's method to meet the basic numerical calculation needs of most other developers (save developers' time and effort).
2. The Code Implementation Idea of Newton's Method in Curve V1
Comments and code are inconsistent; here is a link to our article: https://0xreviews.xyz/2022/02/12/Curve-v1-StableSwap/
2.1 Introduction to Newton's Method
Since the equation cannot be solved directly in the contract, Curve uses Newton's method to iteratively find an approximate solution in the contract. A simple understanding of Newton's method is to use the above formula to continuously iterate the value to make it converge closer to the true solution.
2.2 Curve V1 — Derivation Process Using Newton's Method and Corresponding Code of Function get_D
The function f(D) of Newton's method used in the code is derived from the core formula. The derivation process is briefly described below:
- First, transform the core formula into the form
f(D) = 0
D_new = D - f(D)/f'(D)
- Eventually becomes the form in the code
Use the for loop to iterate to find the solution of D. When the difference between D and the previous solution is less than or equal to 1, stop the iteration.
Usually, the local optimal solution will be found within 4 rounds of iteration; however, if no suitable solution is found after 255 iterations, the program will terminate and the transaction will be reverted; users can use the remove_liquidity method to withdraw their assets.
Note: There is one missing term in the Newton's method formula in the comments, but the correct simplified formula is written in the code.
@pure
@internal
def _get_D(_xp: uint256[N_COINS], _amp: uint256) -> uint256:
"""
D invariant calculation in non-overflowing integer operations
iteratively
A * sum(x_i) * n**n + D = A * D * n**n + D**(n+1) / (n**n * prod(x_i))
Converging solution:
D[j+1] = (A * n**n * sum(x_i) - D[j]**(n+1) / (n**n prod(x_i))) / (A * n**n + (n+1)*D[j]**n/(n**n prod(x_i)) - 1)
Here the original code comment denominator misses one item
(n+1)*D[j]**n/(n**n prod(x_i))
Link of Original code :
https://github.com/curvefi/curve-contract/blob/master/contracts/pool-templates/base/SwapTemplateBase.vy#L214
"""
S: uint256 = 0
Dprev: uint256 = 0
for _x in _xp:
S += _x
if S == 0:
return 0
D: uint256 = S
Ann: uint256 = _amp * N_COINS
for _i in range(255):
D_P: uint256 = D
for _x in _xp:
D_P = D_P * D / (_x * N_COINS) # If division by 0, this will be borked: only withdrawal will work. And that is good
# Only when liquidity is removed is it possible that _x will be 0
# at which point the program will crash, which is also the result we expect
# The value of xp in the pool cannot be 0 when adding liquidity
Dprev = D
D = (Ann * S / A_PRECISION + D_P * N_COINS) * D / ((Ann - A_PRECISION) * D / A_PRECISION + (N_COINS + 1) * D_P)
# Equality with the precision of 1
# When the difference between the new value of the iteration and the previous value is less than or equal to 1, that is, 0 or 1
# The local optimal solution is considered to be found
if D > Dprev:
if D - Dprev <= 1:
return D
else:
if Dprev - D <= 1:
return D
# convergence typically occurs in 4 rounds or less, this should be unreachable!
# if it does happen the pool is borked and LPs can withdraw via `remove_liquidity`
raise
@view
@internal
def _get_D_mem(_balances: uint256[N_COINS], _amp: uint256) -> uint256:
return self._get_D(self._xp_mem(_balances), _amp)
2.3 Exploration, Innovation, and Conclusion of the Curve Team's Historical Errors on get_D
Learning the correct Newton's method from mistakes. When we refactored the Vyper-based Curve with Solidity, we discovered that the comments here are inconsistent with the source code (missing comments and errors in the Curve source code are relatively common).
The notes were originally written by Sam Werner, a member of the Curve team, who is a PhD student in the Centre for Cryptocurrency Research and Engineering at Imperial College. We initially thought that Sam, who was also a PhD from Imperial College, might have proposed a better method than the Curve CEO, so it was written in the comments. Therefore, we implemented a variant of Newton's method according to Sam's approach, which will not be further expanded due to space limitations in this article.
According to Sam's commentary approach, we do not need to change the numerator of Newton's method. By scaling and simplifying the denominator, we modify the slope of each tangent line of Newton's method. In theory, we can eventually achieve the same numerical result.
The final experimental result shows that changing the denominator, i.e., the slope of the tangent line in Newton's method, will not significantly shift the final result. However, it will greatly affect the convergence of Newton's method, which is problematic since we know that Newton's method in both Curve V1 and V2 requires certain precision within 255 operations.
The convergence of Newton's method: The conclusion of the global convergence of Newton's method was not proven until 2018. Newton's method exhibits quadratic convergence near the optimal point. The conclusion is that under Newton's method, the function needs at most steps to make the distance between the obtained optimal approximate solution and the actual optimal point less than . The common precision of DeFi is 10^18, so calculation shows that the result is about 4, which is also the theoretical basis for why most Newton's methods can obtain results in approximately 4 iterations.
The experimental results show that changing the slope of Newton's method will destroy the convergence rate of Newton's method, making Newton's method practically unusable.
Newton's method has many derived methods, including direct improvements of Newton's method (AN, HN, MN), damped Newton's method, and quasi-Newton methods (DFP, BFGS, L-BFGS). These algorithms are safer than directly scaling the derivatives in Newton's method. Direct scaling of Newton's method is rarely validated.
We also considered partial scaling in Curve V2 and found that the convergence speed of the results was not ideal.
All in all, while Sam Werner's comment here was not particularly good, it unexpectedly inspired us to have a deeper understanding and application of Newton's method.
2.4 Curve V1 — Derivation Process Using Newton's Method and Corresponding Code of Function get_y
Given input asset i, quantity x after change, and value _xp before change, calculate the quantity of asset j after change.
Like calculating D, Newton's method is also used here to calculate y, i.e., x_j, and the derivation process of f(x_j) is as follows:
Let
sum'
andprod'
be the cumulative sum and cumulative product excluding the output asset quantity, respectively:sum' = sum(x) - x_j
prod' = prod(x) / x_j
Divide the core formula by
A*n**n
Multiply by
x_j
and substitutesum'
andprod'
Expand the polynomial, which is the
f(x_j)
used by Newton's methodWrite it in the form of
x_j**2 + b*x_j = c
and substitute into the Newton's method formulax = x - f(x) / f'(x)
x_j = (x_j**2 + c) / (2*x_j + b)
@view
@internal
def _get_y(i: int128, j: int128, x: uint256, _xp: uint256[N_COINS]) -> uint256:
"""
Calculate x[j] if one makes x[i] = x
Done by solving quadratic equation iteratively.
x_1**2 + x_1 * (sum' - (A*n**n - 1) * D / (A * n**n)) = D ** (n + 1) / (n ** (2 * n) * prod' * A)
x_1**2 + b*x_1 = c
x_1 = (x_1**2 + c) / (2*x_1 + b)
"""
# x in the input is converted to the same price/precision
assert i != j # dev: same coin
assert j >= 0 # dev: j below zero
assert j < N_COINS # dev: j above N_COINS
# should be unreachable, but good for safety
assert i >= 0
assert i < N_COINS
A: uint256 = self._A()
D: uint256 = self._get_D(_xp, A)
Ann: uint256 = A * N_COINS
c: uint256 = D
S: uint256 = 0
_x: uint256 = 0
y_prev: uint256 = 0
for _i in range(N_COINS):
if _i == i:
_x = x
elif _i != j:
_x = _xp[_i]
else:
continue
S += _x
c = c * D / (_x * N_COINS)
c = c * D * A_PRECISION / (Ann * N_COINS)
b: uint256 = S + D * A_PRECISION / Ann # - D
y: uint256 = D
for _i in range(255):
y_prev = y
y = (y*y + c) / (2 * y + b - D)
# Equality with the precision of 1
if y > y_prev:
if y - y_prev <= 1:
return y
else:
if y_prev - y <= 1:
return y
raise
@view
@external
def get_dy(i: int128, j: int128, _dx: uint256) -> uint256:
xp: uint256[N_COINS] = self._xp()
rates: uint256[N_COINS] = RATES
x: uint256 = xp[i] + (_dx * rates[i] / PRECISION)
y: uint256 = self._get_y(i, j, x, xp)
dy: uint256 = xp[j] - y - 1
fee: uint256 = self.fee * dy / FEE_DENOMINATOR
return (dy - fee) * PRECISION / rates[j]
There is also a _get_y_D() function; the difference from the above is that you can customize the value of D to find y.
@pure
@internal
def _get_y_D(A: uint256, i: int128, _xp: uint256[N_COINS], D: uint256) -> uint256:
3. The Code Implementation Idea of Newton's Method in Curve V2
Newton's method in Curve V2 has certain modifications, and it is relatively difficult to fully understand without sufficient annotation. Here we ignore the technical details when introducing Newton's method. We look directly from the author's perspective at how he converts mathematical ideas into code step by step, hoping to inspire future developers to design more complex algorithms. For the convenience of better understanding, the code we show is from the Python version of the test file.
3.1 Curve V2 — Derivation Process Using Newton's Method and Corresponding Code of Function geometric_mean
Here we input array x to find its geometric mean as D. We denote D and N in the code as n and d, as shown in the formula:
d is the solution target of Newton's method. To obtain better derivation of the constructed function, both sides of the equation are raised to the nth power to obtain:
Move the right side of the equation to the left to construct the function f(d):
Take the derivative of f(d):
According to Newton's method, d_new = d_prev - f(d_prev)/f'(d_prev):
Remove the in the denominator:
Next, we change the equation into the form of numerator/denominator, so that it is convenient to simplify in the numerator first, and the principle of multiplication before division benefits precision control in Vyper & Solidity:
appears where the numerator is the nth power and the denominator is the (n-1)th power of d. To make both the numerator and denominator have n powers so that both can be iterated simultaneously when the code iterates, we multiply both numerator and denominator by d:
Then simplify the numerator and extract the common factor d; such simplification can optimize computational cost:
So far we can see in the code:
D = D * ((N - 1) * 10 ** 18 + tmp) // (N * 10**18)
Now it becomes clear. Among them, tmp is exactly the that we considered when deriving the transformation, which corresponds to the tmp obtained by performing n iterations of for loops. After adding precision control, it is exactly the same as the author's code.
def geometric_mean(x):
N = len(x)
x = sorted(x, reverse=True) # Presort - good for convergence
D = x[0]
for i in range(255):
D_prev = D
tmp = 10 ** 18
for _x in x:
tmp = tmp * _x // D
D = D * ((N - 1) * 10**18 + tmp) // (N * 10**18)
diff = abs(D - D_prev)
if diff <= 1 or diff * 10**18 < D:
return D
print(x)
raise ValueError("Did not converge")
3.2 Curve V2 — Derivation Process Using Newton's Method and Corresponding Code of Function newton_D
The formulas in the whitepaper are as follows, with the initial value of Newton's method being D0:
Equilibrium equation:
,
How did the author think of Newton's method? Why do intermediate variables such as mul1 and mul2 exist? We replace D with d and derive according to the author's idea: First, fully expand the equilibrium equation of Curve V2:
Take the derivative of the above formula with respect to d:
According to Newton's method:
Expand the formula and get:
The denominator that appears most frequently is , which is the expansion of in the equilibrium equation. For convenience of observation and simplification, we denote as g1k0, and the equation after replacement becomes:
At the same time, abbreviate as Prod, and as S. The above equation is simplified to:
Denote the numerator and denominator as:
Next, we observe and analyze the complex formula. We can see that most terms in the numerator and denominator contain as a common factor. And in the equilibrium equation can be seen to be very similar to this factor. Since K itself contains K0, and K0 contains N cycles and contains D, forcing K as a new common factor can be obtained:
We found that the factor appears in the denominator, and we will simplify the conversion of the similar K0 structural formula in the denominator to K0:
Then we observe that the denominator contains d, and we consider eliminating the denominator. Also notice that A and always appear together, and merge as ANN. Get the following formula:
Observe again, we find that the formula contains a lot of , we mark it as mul2, and at the same time mark the part of the first more complex formula that does not involve K0 and for loop as :
Denote -denominator/Kd^2 as neg_prime In the same way, the numerator can be transformed by Kd^2, and similar results can be obtained by substituting into the following formula:
Got the exact same core Newton's formula as in the code.
def newton_D(A, gamma, x, D0):
D = D0
i = 0
S = sum(x)
x = sorted(x, reverse=True)
N = len(x)
for i in range(255):
D_prev = D
K0 = 10**18
for _x in x:
K0 = K0 * _x * N // D
_g1k0 = abs(gamma + 10**18 - K0)
# D / (A * N**N) * _g1k0**2 / gamma**2
mul1 = 10**18 * D // gamma * _g1k0 // gamma * _g1k0 * A_MULTIPLIER // A
# 2*N*K0 / _g1k0
mul2 = (2 * 10**18) * N * K0 // _g1k0
neg_fprime = (S + S * mul2 // 10**18) + mul1 * N // K0 - mul2 * D // 10**18
assert neg_fprime > 0 # Python only: -f' > 0
# D -= f / fprime
D = (D * neg_fprime + D * S - D**2) // neg_fprime - D * (mul1 // neg_fprime) // 10**18 * (10**18 - K0) // K0
if D < 0:
D = -D // 2
if abs(D - D_prev) <= max(100, D // 10**14):
return D
raise ValueError("Did not converge")
3.3 Curve V2 — Derivation Process Using Newton's Method and Corresponding Code of Function newton_y
There is an error regarding the initial value in the Curve V2 whitepaper. We can set the initial values of Newton's method iteration for D and x_i respectively as follows to find the solution faster:
The definition of the initial value in the whitepaper is wrong. The power of D in the numerator is written as N-1, and the power of N in the denominator is written as N-1, which should both be N. We corrected them in the code for
x_i,0
.
Wrong definition of the initial value in the whitepaper:
How did the author come up with Newton's method, and why do intermediate variables such as mul1 and mul2 exist? We continue to follow the author's ideas and derive: In the same way, the equilibrium equation of Curve V2 is fully expanded. For the convenience of writing, we assume that x(n) is unknown and wait for the newton_y method to solve it. We expand the equilibrium equation and simplify the analysis as a function of x(n):
Derive f(x(n)) to get:
Similar to newton_D, we also found that making a replacement can greatly improve readability through observation.
denote as g1k0
denote as ANN
denote as S
Simplify as:
Similar to the newton_D function, we notice K through observation, and denote as Prod_i to obtain the following formula:
We noticed that Prod_i often appears in the numerator, and Prod_i*x(n)=Prod, so we transform the equation by 1/x(n) for simplification to get:
Continuing to observe, we can see that the above formula contains many terms. The following formula is obtained after transformation:
The denominator contains many d terms, so we can move 1/d outside the parentheses to get:
Observe again, we find that the formula contains a lot of S terms, and the coefficients are combined to get , we mark it as mul2, and at the same time mark the part of the first more complex formula that does not involve K0 and for loop as
denote x(n) as the y to be solved for newton_y:
Denote the part in parentheses in the above formula as yfprime:
We can see that it is consistent with the core iteration in the code.
In the same way, the numerator can be transformed by Kd^2, and similar results can be obtained by substituting into the following formula:
Got the exact same core Newton's formula as in the code.
def newton_y(A, gamma, x, D, i):
N = len(x)
y = D // N
K0_i = 10**18
S_i = 0
x_sorted = sorted(_x for j, _x in enumerate(x) if j != i)
convergence_limit = max(max(x_sorted) // 10**14, D // 10**14, 100)
for _x in x_sorted:
y = y * D // (_x * N) # Small _x first
S_i += _x
for _x in x_sorted[::-1]:
K0_i = K0_i * _x * N // D # Large _x first
for j in range(255):
y_prev = y
K0 = K0_i * y * N // D
S = S_i + y
_g1k0 = abs(gamma + 10**18 - K0)
# D / (A * N**N) * _g1k0**2 / gamma**2
mul1 = 10**18 * D // gamma * _g1k0 // gamma * _g1k0 * A_MULTIPLIER // A
# 2*K0 / _g1k0
mul2 = 10**18 + (2 * 10**18) * K0 // _g1k0
yfprime = (10**18 * y + S * mul2 + mul1 - D * mul2)
fprime = yfprime // y
assert fprime > 0 # Python only: f' > 0
# y -= f / f_prime; y = (y * fprime - f) / fprime
y = (yfprime + 10**18 * D - 10**18 * S) // fprime + mul1 // fprime * (10**18 - K0) // K0
if j > 100: # Just logging when doesn't converge
print(j, y, D, x)
if y < 0 or fprime < 0:
y = y_prev // 2
if abs(y - y_prev) <= max(convergence_limit, y // 10**14):
return y
raise Exception("Did not converge")
4. Applicability of Newton's Method and Summary of Code Ideas
4.1 The Applicability of Newton's Method in DeFi
Newton's method is a universal algorithm. Under the condition that a good initial value can be given, it can solve most equation evaluation problems without analytical solutions within 10 iterations. Therefore, it is suitable for most complex DeFi numerical algorithms, and we believe that this algorithm will become popular and be used by more and more people.
Popularizing Newton's method is also popularizing Curve V1 & V2 at the same time, which is also the original intention of our research. In the above article, we have seen that compared with Newton's method in textbooks, the application of Newton's method in practical business has a higher threshold.
4.2 Summary of the Application of Newton's Method in Vyper & Solidity
Decide what to simplify and propose as the sum of common factors, and determine what becomes the intermediate variable generated by the priority loop first: Some intermediate variables in the code implementation of Newton's method are generated by looping N times. These variables are often considered in combination with the variable forms that appear more frequently during mathematical simplification. For example, in the function newton_D, we found that K0 and K have a high occurrence rate, so in the design of intermediate variables, K0 or K should be given priority instead of or contained in K0.
The principle of multiplication before division: Minimize the appearance of division in the form of numerator/denominator and reduce fractions to a common denominator to make them multiplication operations. Pay attention to EVM numerical overflow when performing multiplication (consult us if there is a numerical overflow).
The derivation will reduce the precision of some formulas: It is necessary to manually adjust the precision of the code to meet the premise of consistent precision. Only the same precision can achieve the same result as in the mathematical formula.
Appropriately replace frequently occurring forms with simple symbols.
Prevent numerical overflow and the principle of large before small: Executing Newton's method in the EVM needs to match the appropriate division control precision balance to the data to be performed in the continuous multiplication loop to avoid numerical overflow. At the same time, note that the division of large numbers is performed first and then the division of small numbers is performed, i.e., the data needs to be rearranged from large to small.
4.3 Difficulties in Understanding the Code of Curve V2
The biggest difficulty in understanding Curve V2 compared to Uniswap V3 project is:
- Curve is more inclined toward mathematical optimization, and many DeFi code logics are original in the digital token community.
- More than 95% of Curve V2's code is contributed by Curve CEO Michael, with few comments, and the comments given by others have some errors.
- Michael's misrepresentation of some key points in the whitepaper makes it difficult to understand.
- Compared with Uniswap, Curve lacks other people's research as references.
4.4 Other Algorithms of Curve and Precision Control Problems
Due to space limitations, this paper does not present a particularly complete description of Newton's method used in Curve. Another difficulty in constructing Newton's method in actual business code is the problem of precision control, which is worth summarizing in another detailed article.
At the same time, there are other algorithms lacking annotations in the math library of Curve V2 that are difficult to understand. For example, the halfpow function is actually an application of the binomial algorithm in the non-integer field.
We have fully understood the mathematical algorithms and specific codes in Curve V1 & V2. At the same time, we also gained a complete understanding of the business code of Curve V1 & V2.