Curve newton's method (EN)
By 0xstan & 0xmc
Reference and Innovation of Newton's Method Code
1. The necessity and importance of Newton method in Solidity&Vyper
Just as when the Internet went through web1.0 and web2.0, it was the user-facing application layer that ultimately captured the most value, the blockchain world is not short of increasingly homogeneous infrastructure, but of application layers with different functions. If web3.0 really does take off in the future, the driving factor of capturing value will shift from the current base layer to the application layer. The author believes that the current EVM performance bottleneck and calculation gas consumption, plus some language features of solidity and vyper, make many developers currently discouraged from many application functions, among which the imbalance between numerical calculation performance and resource consumption is a relatively troublesome problem faced by developers at present. Many of the "fast" and "saving" algorithms in python, C++, Javascript have to make various compromises when implemented by solidity and vyper. In order to achieve a specific function, many project managers even need to independently implement an algorithm function that is difficult to understand, and these realized algorithms are difficult to become a universal wheel for other developers to use directly, understanding these algorithms requires developers to work together. In order to achieve a specific function, many project parties even need to independently implement an algorithm function that is difficult to understand, and it is difficult for these implemented algorithms to become a universal wheel for other developers to use directly, understanding these algorithms requires developers to work together.
Curve is exactly the project manager mentioned above. It is a typical project led by a single developer with many original ideas and innovations in engineering implementation. The importance of Curve in the blockchain world needs no elaboration.
Here we salute Michael Egorov, the great founder of Curve, and thank him for the various innovations he has brought to the blockchain. In this paper, I would like to offer my own views and insights on a small point, with three purposes:
- Elaborate the Newton algorithm in the Curve project
- Deformation methods and ideas of Newton algorithm in different business scenarios.
- Inspire the research of other numerical optimization algorithms
The third point is the most important thing in my opinion, for the following reasons :
Obviously, Newton's method is not the fastest and cheapest universal algorithm for numerical optimization. Just as no DEFI project can guarantee itself without any bugs, there is no numerical optimization algorithm in defi world that can guarantee it to be the "fastest" and the "cheapest" in the project. In this paper, we hope that subsequent researchers can continue in-depth research and discussion on the questions we raised, which are as follows (If possible, we hope the Curve Project can provide Grant support for the following questions) :
- If the Newton method of Curve is not optimal, which algorithm is the fastest and most "economical" (save gas)?
- Build solidity function wheels for Newton's method or a variant of Newton's method to meet the basic numerical calculation needs of most other developers (save developers' hair).
2. The code implementation idea of Newton's method in Curve V1
Comments and code are inconsistent, here is a link to my article: https://0xreviews.xyz/2022/02/12/Curve-v1-StableSwap/
2.1 Introduction of Newton's Method
Since the equation cannot be solved directly in the contract, Curve uses Newton's method to iteratively find an approximate solution in the contract. A simple understanding of Newton's method is to use the above formula to continuously iterate the value of to make it closer and closer to the real solution.
2.2 Curve V1—— Derivation process using Newton's method and corresponding code of function get_D
The function f(D) of Newton's method used in the code is derived from the core formula. The derivation process is briefly described below:
- First transform the core formula into the form
f(D) = 0
D_new = D - f(D)/f'(D)
- Eventually becomes the form in the code
-
Use the for loop to iterate to find the solution of D. When the difference between D and the previous solution is less than or equal to 1, stop the iteration;
-
Usually, the local optimal solution will be found within 4 rounds of iteration; however, if no suitable solution is found after 255 iterations, the program will terminate and the transaction will be rolled back (revert); users can use the remove_liquidity method to take back assets;
Note: There is one missing item in Newton's method formula in the comments, and the correct simplified formula is written in the code.
@pure
@internal
def _get_D(_xp: uint256[N_COINS], _amp: uint256) -> uint256:
"""
D invariant calculation in non-overflowing integer operations
iteratively
A * sum(x_i) * n**n + D = A * D * n**n + D**(n+1) / (n**n * prod(x_i))
Converging solution:
D[j+1] = (A * n**n * sum(x_i) - D[j]**(n+1) / (n**n prod(x_i))) / (A * n**n + (n+1)*D[j]**n/(n**n prod(x_i)) - 1)
Here the original code comment denominator misses one item
(n+1)*D[j]**n/(n**n prod(x_i))
Link of Original code :
https://github.com/curvefi/curve-contract/blob/master/contracts/pool-templates/base/SwapTemplateBase.vy#L214
"""
S: uint256 = 0
Dprev: uint256 = 0
for _x in _xp:
S += _x
if S == 0:
return 0
D: uint256 = S
Ann: uint256 = _amp * N_COINS
for _i in range(255):
D_P: uint256 = D
for _x in _xp:
D_P = D_P * D / (_x * N_COINS) # If division by 0, this will be borked: only withdrawal will work. And that is good
# Only when liquidity is removed is it possible that _x will be 0
# at which point the program will crash, which is also the result we expect
# The value of xp in the pool cannot be 0 when adding liquidity
Dprev = D
D = (Ann * S / A_PRECISION + D_P * N_COINS) * D / ((Ann - A_PRECISION) * D / A_PRECISION + (N_COINS + 1) * D_P)
# Equality with the precision of 1
# When the difference between the new value of the iteration and the previous value is less than or equal to 1, that is, 0 or 1
# The local optimal solution is considered to be found
if D > Dprev:
if D - Dprev <= 1:
return D
else:
if Dprev - D <= 1:
return D
# convergence typically occurs in 4 rounds or less, this should be unreachable!
# if it does happen the pool is borked and LPs can withdraw via `remove_liquidity`
raise
@view
@internal
def _get_D_mem(_balances: uint256[N_COINS], _amp: uint256) -> uint256:
return self._get_D(self._xp_mem(_balances), _amp)
2.3 Exploration, innovation, and conclusion of Curve Team's historical mistakes on get_D
Learn the right Newton's method from mistakes. When we refactored the vyper-based Curve with solidity, we found that the comments here are inconsistent with the source code (missing comments and errors in the Curve source code are relatively common).
The notes were first written by Sam Werner, a member of the Curve team, He is a PhD student in the Centre for Cryptocurrency Research and Engineering at Imperial. We thought at the time that Sam, who was also a PhD from Imperial College, might have proposed a better method than Curve CEO, so it was written in the comments. Therefore, we created a kind of Newton's method according to the method of Sam, which will not be further expanded due to the space limitation of this article.
According to Sam's commentary idea, we do not need to change the numerator of Newton's method. By scaling and simplifying the denominator, we change the slope of each tangent of Newton's method. In theory, we can eventually get the same numerical result.
The final result of the experiment is that changing the denominator, that is, the slope of the tangent line of Newton's method, will not shift the final result too much. But it will greatly affect the convergence of Newton's method, which is disastrous as we know that Newton's method in both Curve V1 and V2 requires a certain precision within 255 operations.
The convergence of Newton's method: The conclusion of the global convergence of Newton's method was not proven until 2018. Newton's method is quadratic convergence near the optimal point. The conclusion is that under Newton's method, the function only needs at most steps to make the distance between the obtained optimal approximate solution and the actual optimal point less than . The common precision of Defi is 10^18, the calculation shows that the result is about 4, which is also the basis of the author's theory that most of Newton's methods can get results in about 4 iterations.
The results of the experiment show that changing the slope of Newton's method will destroy the convergence rate of Newton's method, making Newton's method actually unavailable.
Newton's method has many following derived methods, including direct improvement of Newton's method (AN HN MN), damped Newton's method, and quasi-Newton's method (DFP, BFGS, L-BFGS). These algorithms are safer than directly scaling the derivatives in Newton's method. A direct scaling of Newton's method is seldomly verified.
We also considered partial scaling in Curve V2, and also found that the convergence speed of the results was not ideal.
All in all, it is true that Sam Werner here gave a not very good comment, but unexpectedly inspired us to have a deeper understanding and application of Newton's method.
2.4 Curve V1—— Derivation process using Newton's method and corresponding code of function get_y
Input asset i , quantity x after change, value _xp before change, calculate the quantity of asset j after change.
Like calculating D, Newton's method is also used here to calculate y, that is, x_j , and the derivation process of f(x_j) is as follows:
-
Let
sum'
,prod'
be the cumulative and cumulative results of excluding the number of output assets, respectivelysum' = sum(x) - x_j
prod' = prod(x) / x_j
-
Divide the core formula by
A*n**n
-
Multiply by
x_j
,and substitutesum'
andprod'
-
Expand the polynomial, which is the
f(x_j)
used by Newton's method -
Write it in the form of
x_j**2 + b*x_j = c
,substitute into the Newton's method formulax = x - f(x) / f'(x)
x_j = (x_j**2 + c) / (2*x_j + b)
@view
@internal
def _get_y(i: int128, j: int128, x: uint256, _xp: uint256[N_COINS]) -> uint256:
"""
Calculate x[j] if one makes x[i] = x
Done by solving quadratic equation iteratively.
x_1**2 + x_1 * (sum' - (A*n**n - 1) * D / (A * n**n)) = D ** (n + 1) / (n ** (2 * n) * prod' * A)
x_1**2 + b*x_1 = c
x_1 = (x_1**2 + c) / (2*x_1 + b)
"""
# x in the input is converted to the same price/precision
assert i != j # dev: same coin
assert j >= 0 # dev: j below zero
assert j < N_COINS # dev: j above N_COINS
# should be unreachable, but good for safety
assert i >= 0
assert i < N_COINS
A: uint256 = self._A()
D: uint256 = self._get_D(_xp, A)
Ann: uint256 = A * N_COINS
c: uint256 = D
S: uint256 = 0
_x: uint256 = 0
y_prev: uint256 = 0
for _i in range(N_COINS):
if _i == i:
_x = x
elif _i != j:
_x = _xp[_i]
else:
continue
S += _x
c = c * D / (_x * N_COINS)
c = c * D * A_PRECISION / (Ann * N_COINS)
b: uint256 = S + D * A_PRECISION / Ann # - D
y: uint256 = D
for _i in range(255):
y_prev = y
y = (y*y + c) / (2 * y + b - D)
# Equality with the precision of 1
if y > y_prev:
if y - y_prev <= 1:
return y
else:
if y_prev - y <= 1:
return y
raise
@view
@external
def get_dy(i: int128, j: int128, _dx: uint256) -> uint256:
xp: uint256[N_COINS] = self._xp()
rates: uint256[N_COINS] = RATES
x: uint256 = xp[i] + (_dx * rates[i] / PRECISION)
y: uint256 = self._get_y(i, j, x, xp)
dy: uint256 = xp[j] - y - 1
fee: uint256 = self.fee * dy / FEE_DENOMINATOR
return (dy - fee) * PRECISION / rates[j]
There is also a _get_y_D() function, the difference from the above is that you can customize the value of D to find y.
@pure
@internal
def _get_y_D(A: uint256, i: int128, _xp: uint256[N_COINS], D: uint256) -> uint256:
3. The code implementation idea of Newton's method in Curve V2
The Newton method in Curve V2 has certain deformations, and it is relatively difficult to fully understand under the premise of insufficient annotation. Here we ignore the technical details when we introduce Newton's method. Here we look directly from the author's point of view, how he converts mathematical ideas into code step by step, hoping to inspire future generations to design more complex algorithms. Here for the convenience of better understanding, the code we show is the code in the Python version of the test file.
3.1 Curve V2—Derivation process using Newton's method and corresponding code of function geometric_mean
Here we input array x to find its geometric mean as D. We denote D and N in the code as n and d, as shown in the formula:
d is the solution target of Newton's method. In order to obtain a better derivation of the constructed function, both sides of the equation are enlarged to the nth power at the same time to obtain
Move the right side of the equation to the left to construct the function f(d)
Take derivative of f(x)
According to Newton's method, d_new= d_prev-f(d_prev)/f'(d_prev)
Remove the in the denominator, we get
Next, we change the equation into the form of numerator/denominator, so that it is convenient to simplify in the numerator first, and the principle of first multiplication and then division is beneficial to the control the accuracy of vyper&solidity.
appears, the numerator is the n th item of d, and the denominator is the n-1 th item of d. In order to make the numerator and denominator both n items so that the two can be iterated at the same time when the code is iterated, we naturally thought of here. The function becomes the same n times, multiply the numerator and denominator by d at the same time
then simplify numerator and extract the common factor d, simplification like that can optimize the computational cost.
So far we see in the code
D = D * ((N - 1) * 10 ** 18 + tmp) // (N * 10**18)
Now it has come to light. Among them, tmp is exactly the that we considered when deriving the deformation, which just corresponds to the tmp obtained by performing n times of for loops. After adding the precision control, it is exactly the same as the author's code.
def geometric_mean(x):
N = len(x)
x = sorted(x, reverse=True) # Presort - good for convergence
D = x[0]
for i in range(255):
D_prev = D
tmp = 10 ** 18
for _x in x:
tmp = tmp * _x // D
D = D * ((N - 1) * 10**18 + tmp) // (N * 10**18)
diff = abs(D - D_prev)
if diff <= 1 or diff * 10**18 < D:
return D
print(x)
raise ValueError("Did not converge")
3.2 Curve V2—Derivation process using Newton's method and corresponding code of function newton_D
The formulas in the white paper are as follows, the initial value of Newton's method is D0
Equilibrium equation:
,
How did the author think of Newton's method? Why do intermediate variables such as mul1 and mul2 exist? We replace D with d and deduce according to the author's idea: First, fully expand the equilibrium equation of Curve V2
Derivation of the above formula with respect to d :
According to Newton's method:
Expand the formula and get :
The denominator that appears more often is , which is the expansion of in the equilibrium equation. For the convenience of observation and simplification, we denote as g1k0, and the equation after replacement becomes :
At the same time, is abbreviated as Prod, and is abbreviated as S. The above equation is simplified to :
Denote the numerator and denominator as:
Next, we observe and analyze the complex formula, and we can see that most of the terms in the numerator and denominator contain as a common factor. And in the equilibrium equation can be seen to be very similar to this factor. And K itself contains K0, and K0 contains N cycles and contains D. Therefore, forcing K as a new common factor can be obtained :
We found that the factor appears in the denominator, and we will simplify the conversion of the similar K0 structural formula in the denominator to K0:
Then we observe that the denominator contains ,and we consider eliminating the denominator. Also notice that and always appear at the same time, and merge as ANN. Get the following formula:
Observe again, we find that the formula contains a lot of , we mark it as mul2, and at the same time mark the part of the first more complex formula that does not involve K0 and for loop as
Denote -denominator/Kd^2 as neg_prime In the same way, the molecule can be deformed by Kd^2, and similar results can be obtained by substituting it into the following formula :
Got the exact same core Newton's formula as in the code.
def newton_D(A, gamma, x, D0):
D = D0
i = 0
S = sum(x)
x = sorted(x, reverse=True)
N = len(x)
for i in range(255):
D_prev = D
K0 = 10**18
for _x in x:
K0 = K0 * _x * N // D
_g1k0 = abs(gamma + 10**18 - K0)
# D / (A * N**N) * _g1k0**2 / gamma**2
mul1 = 10**18 * D // gamma * _g1k0 // gamma * _g1k0 * A_MULTIPLIER // A
# 2*N*K0 / _g1k0
mul2 = (2 * 10**18) * N * K0 // _g1k0
neg_fprime = (S + S * mul2 // 10**18) + mul1 * N // K0 - mul2 * D // 10**18
assert neg_fprime > 0 # Python only: -f' > 0
# D -= f / fprime
D = (D * neg_fprime + D * S - D**2) // neg_fprime - D * (mul1 // neg_fprime) // 10**18 * (10**18 - K0) // K0
if D < 0:
D = -D // 2
if abs(D - D_prev) <= max(100, D // 10**14):
return D
raise ValueError("Did not converge")
3.3 Curve V2—Derivation process using Newton's method and corresponding code of function newton_y
There is an error about the initial value in the Curve V2 white paper. We can set the initial value of Newton's method iteration of D and x_i respectively as follows to find the solution faster
The definition of the initial value in the white paper is wrong. The power of D in the numerator is written as N-1, and the power of N in the denominator is written as N-1, which should all be N, we corrected them in the code
x_i,0
Wrong definition of the initial value in the white paper :
How did the author come up with Newton's method, and why do intermediate variables such as mul1 and mul2 exist? We continue to follow the author's ideas and deduce : In the same way, the equilibrium equation of Curve V2 is fully expanded. For the convenience of writing, we assume that x(n) is unknown and wait for the newton_y method to solve it. We expand the equilibrium equation and simplify the analysis as a function of wait
Derive to get :
Similar to newto`n_D, we also found that making a replacement can greatly improve the readability through observation.
denote as g1k0
denote as ANN
denote as S
Simplify as :
Similar to the newton_D function, we notice K through observation, and denote as to obtain the following formula :
We noticed that often appears in the molecule, and ,so we deform the equation by for simplification to get :
Continuing to observe, we can see that the above formula contains many the following formula is obtained after deformation:
The denominator contains many d, we can move outside the parentheses to get :
Observe again, we find that the formula contains a lot of S, and the coefficients are combined to get , we mark it as mul2, and at the same time mark the part of the first more complex formula that does not involve K0 and for loop as
denote x(n) as the y to be solved for newton_y
Denote the part in parentheses in the above formula as yfprime
We can see that it is consistent with the core iteration in the code
In the same way, the molecule can be deformed by Kd^2, and similar results can be obtained by substituting it into the following formula :
Got the exact same core Newton's formula as in the code.
def newton_y(A, gamma, x, D, i):
N = len(x)
y = D // N
K0_i = 10**18
S_i = 0
x_sorted = sorted(_x for j, _x in enumerate(x) if j != i)
convergence_limit = max(max(x_sorted) // 10**14, D // 10**14, 100)
for _x in x_sorted:
y = y * D // (_x * N) # Small _x first
S_i += _x
for _x in x_sorted[::-1]:
K0_i = K0_i * _x * N // D # Large _x first
for j in range(255):
y_prev = y
K0 = K0_i * y * N // D
S = S_i + y
_g1k0 = abs(gamma + 10**18 - K0)
# D / (A * N**N) * _g1k0**2 / gamma**2
mul1 = 10**18 * D // gamma * _g1k0 // gamma * _g1k0 * A_MULTIPLIER // A
# 2*K0 / _g1k0
mul2 = 10**18 + (2 * 10**18) * K0 // _g1k0
yfprime = (10**18 * y + S * mul2 + mul1 - D * mul2)
fprime = yfprime // y
assert fprime > 0 # Python only: f' > 0
# y -= f / f_prime; y = (y * fprime - f) / fprime
y = (yfprime + 10**18 * D - 10**18 * S) // fprime + mul1 // fprime * (10**18 - K0) // K0
if j > 100: # Just logging when doesn't converge
print(j, y, D, x)
if y < 0 or fprime < 0:
y = y_prev // 2
if abs(y - y_prev) <= max(convergence_limit, y // 10**14):
return y
raise Exception("Did not converge")
4. Applicability of Newton's method and summary of code ideas.
4.1 The applicability of Newton's method in Defi.
Newton's method is a panacea algorithm. Under the condition that a good initial value can be given, it can solve most of the equation evaluation problems without analytical solutions within 10 times. Therefore, it is suitable for most of the more complex Defi numerical algorithms, and we also believe that this algorithm will come into fashion and be used by more and more people.
Popularizing Newton's method is also popularizing for Curve V1&V2 at the same time, which is also the original intention of our research. In the above article, we have seen that compared with Newton's method in the textbook, employment of Newton's method in practical business has a higher threshold.
4.2 Summary of the application of Newton's method in vyper&solidity.
- Decide who to simplify and propose as the sum of the common factors, and let who becomes the intermediate variable generated by the priority loop first : Some intermediate variables in the code implementation of Newton's method are generated by looping N times. These variables are often considered in combination with the more variable forms that appear during mathematical simplification. For example, in the function newton_D, we found that K0 and K have a high occurrence rate, so in the design of intermediate variables, K0 or K should be given priority instead of or contained in .
- The principle of multiplying before dividing Minimize the appearance of division in the form of numerator/denominator and reduce the fraction to a common denominator to make it a multiplication operation. Pay attention to EVM numerical overflow when doing multiplication(consult us if there is a numerical overflow).
- The derivation will reduce the accuracy of some formulas It is necessary to manually change the precision of the code to meet the premise of consistent precision. Only the same precision can get the same result as in the mathematical formula.
- Appropriately replace frequently occurring forms with simple symbols.
- Prevent numerical overflow and the principle of first large then small Executing Newton's method in the EVM needs to match the appropriate division control precision balance to the data to be performed in the continuous multiplication loop to avoid numerical overflow. At the same time, note that the division of large numbers is performed first and then the division of decimals is performed, that is, the data needs to be rearranged from large to small.
4.3 Difficulties in understanding the code of Curve V2
The biggest difficulty in understanding Curve V2 compared to Uniswap V3 project distance is :
- Curve is more inclined to mathematical optimization, and many Defi code logics are original in the digital token community.
- More than 95% of Curve V2's code is contributed by Curve CEO Michael, with few comments, and the comments given by others have some errors.
- Michael's misrepresentation of some key points in the white paper makes it difficult to understand.
- Compared with Uniswap, Curve lacks other people's research as references.
4.4 Other algorithms of Curve and precision control problems
Due to space limitations, this paper does not present a particularly complete description of Newton's method used in Curve. Another difficulty in constructing Newton's method in actual business code is the problem of precision control, which is worth summarizing in another detailed article.
At the same time, there are other algorithms lacking annotations in the math library of Curve V2 that are difficult to understand. For example, the halfpow function is actually an application of the binomial algorithm in the non-integer field.
We have fully understood the mathematical algorithms and specific codes in Curve V1&V2. At the same time, we also gained a complete understanding of the business code of Curve V1&V2.
At present, 0xstan has completed the solidity-based reconstruction of the core contract of Curve V1.
At present, the Code Review section of Curve V1 and Curve V2 has been put on the official website of 0xreviews.xyz and 0xmc.com.