Stefan's Blog

Sage vs SymPy - integration

During the recent GSoC summit I had the chance to participate in many fascinating discussions. One such occasion was while meeting the Sage representative.

A detail he mentioned, was that during his tests SymPy frequently failed to solve integrals that Sage (using Maxima) was able to solve. An explanation, in which I like to believe, would be that he was testing an old version of SymPy lacking the new integration routines implemented during the last few GSoC projects. Hence my decision the compare the most recent versions of both projects.

The tested versions are SymPy 0.7.2 and Sage 5.3.

Depending on screen size and wordpress theme the table might be badly formatted so here is a link to the wiki html page and a pdf version.

It should be noted that Sage is more rigorous about the assumptions on its symbols and so it fails to integrate something like $ x^n$ if $ n$ is not explicitly different than -1. I personally think that this is a feature and not a bug. Due to this difference however the script used to test Sage differs from the one used for SymPy.

Another minor methodological difference in the tests is the fact that the timeout pattern that I used failed to work for the Sage interpreter. Hence, SymPy integration timeouts at about 120 seconds while Sage integration is manually interrupted when it takes too much time.

Final methodological difference is that I purge the SymPy cache between each integral as otherwise the RAM usage becomes too great.

The results show that SymPy is slightly better in using special functions to solve integrals, but there are also a few integrals that Sage solves while SymPy fails to do so. On few occasions Sage fails disgracefully, meaning  that it returns an error instead of unevaluated integral. When both packages fail to evaluate the integral SymPy is much slower to say so (timeout for SymPy compared to 1 or 2 seconds for Sage to return an unevaluated integral). Finally, on some occasions the results by Sage seem better simplified.

Integrals solved better by SymPy (if you consider special functions "better"):

  • $ \frac{1}{a x^n + 1}$ with the use of a special function while Sage returns unevaluated integrals
  • $ \frac{a x^n}{b x^m + 1}$ with the use of a special function while Sage returns unevaluated integrals
  • $ \frac{a x^n + 1}{b x^m + 1}$ with the use of a special function while Sage returns unevaluated integrals
  • $ \frac{a x^5 + x^3 + 1}{b x^5 + x^3 + 1}$ with the use of a special function while Sage returns unevaluated integrals

Integrals solved better by Sage:

  • $ \frac{a x^2}{b x^2 + 1}$ solved by both but Sage's result is simpler (it uses arctan instead of log)
  • $ \frac{1}{\sqrt{x^2 + 1}}$ SymPy fails this simple integral
  • $ \log\left(\frac{a x^3}{b x^3 + 1}\right)$ solved by both but Sage's result is much simpler
  • $ \log\left(\frac{a x^2 + 1}{b x^2 + 1}\right)$ SymPy fails
  • $ \log\left(\frac{a x^3 + 1}{b x^3 + 1}\right)$ SymPy fails
  • $ \frac{1}{\sin x + 1}$ SymPy fails
  • $ \frac{a \sin^2 x + 1}{b \sin^2 x + 1}$ SymPy fails

The code for the SymPy tests:

import signal  
from time import clock  
from sympy import *  
from sympy.core.cache import clear_cache

class TimeoutException(Exception):  
pass

def timeout_handler(signum, frame):  
raise TimeoutException()

a, b, x = symbols('a b x')  
n, m = symbols('n m', integer=True)

integrants = [  
x,  
a*x**n,  
a*x**n + 1,  
a*x**b + 1,

1/x,  
1/(x + 1),  
1/(x**2 + 1),  
1/(x**3 + 1),  
1/(a*x**n),  
1/(a*x**n + 1),  
1/(a*x**b + 1),

a*x**2/(b*x**2 + 1),  
a*x**3/(b*x**3 + 1),  
a*x**n/(b*x**m + 1),  
(a*x**2 + 1)/(b*x**2 + 1),  
(a*x**3 + 1)/(b*x**3 + 1),  
(a*x**n + 1)/(b*x**m + 1),  
(a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1),

sqrt(1/x),  
sqrt(1/(x + 1)),  
sqrt(1/(x**2 + 1)),  
sqrt(1/(x**3 + 1)),  
sqrt(1/(a*x**n)),  
sqrt(1/(a*x**n + 1)),  
sqrt(1/(a*x**b + 1)),  
sqrt(a*x**2/(b*x**2 + 1)),  
sqrt(a*x**3/(b*x**3 + 1)),  
sqrt(a*x**n/(b*x**m + 1)),  
sqrt((a*x**2 + 1)/(b*x**2 + 1)),  
sqrt((a*x**3 + 1)/(b*x**3 + 1)),  
sqrt((a*x**n + 1)/(b*x**m + 1)),  
sqrt((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

log(x),  
log(1/x),  
log(1/(x + 1)),  
log(1/(x**2 + 1)),  
log(1/(x**3 + 1)),  
log(1/(a*x**n)),  
log(1/(a*x**n + 1)),  
log(1/(a*x**b + 1)),  
log(a*x**2/(b*x**2 + 1)),  
log(a*x**3/(b*x**3 + 1)),  
log(a*x**n/(b*x**m + 1)),  
log((a*x**2 + 1)/(b*x**2 + 1)),  
log((a*x**3 + 1)/(b*x**3 + 1)),  
log((a*x**n + 1)/(b*x**m + 1)),  
log((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

sin(x),  
sin(x)**n*cos(x)**m,  
sin(a*x)**n*cos(b*x)**m,  
1/sin(x),  
1/(sin(x) + 1),  
1/(sin(x)**2 + 1),  
1/(sin(x)**3 + 1),  
1/(a*sin(x)**n),  
1/(a*sin(x)**n + 1),  
1/(a*sin(x)**b + 1),  
a*sin(x)**2/(b*sin(x)**2 + 1),  
a*sin(x)**3/(b*sin(x)**3 + 1),  
a*sin(x)**n/(b*sin(x)**m + 1),  
(a*sin(x)**2 + 1)/(b*sin(x)**2 + 1),  
(a*sin(x)**3 + 1)/(b*sin(x)**3 + 1),  
(a*sin(x)**n + 1)/(b*sin(x)**m + 1),  
(a*sin(x)**5 + sin(x)**3 + 1)/(b*sin(x)**5 + sin(x)**3 + 1),  
]

integrated = []  
durations = []

f_integrants = open('dump_integrants', 'w')  
f_integrated = open('dump_integrated', 'w')  
f_durations = open('dump_duration', 'w')

for index, integrant in enumerate(integrants):  
clear_cache()  
print '====================================='  
print index, ' of ', len(integrants)  
print '###', integrant  
start = clock()  
try:  
old_handler = signal.signal(signal.SIGALRM, timeout_handler)  
signal.alarm(120)  
integrated.append(integrate(integrant, x))  
signal.alarm(0)  
except TimeoutException:  
integrated.append(TimeoutException)  
finally:  
signal.signal(signal.SIGALRM, old_handler)  
durations.append(clock() - start)  
print '###', integrated[-1]  
print 'in %f seconds'%durations[-1]

f_integrants.write(str(integrant))  
f_integrated.write(str(integrated[-1]))  
f_durations.write(str(durations[-1]))  
f_integrants.write('\n')  
f_integrated.write('\n')  
f_durations.write('\n')  

And for Sage:

import signal  
from time import clock  
from sage.all import *  
from sage.symbolic.integration.integral import indefinite_integral

class TimeoutException(Exception):  
pass

def timeout_handler(signum, frame):  
raise TimeoutException()

a, b, x = var('a b x')  
n, m = var('n m')  
assume(n, 'integer')  
assume(m, 'integer')

assume(n != 1)  
assume(n != -1)  
assume(n != 2)  
assume(n\>0)  
assume(b != 1)  
assume(b != -1)  
assume(b\>0)  
assume(a\>0)

integrants = [  
x,  
a*x**n,  
a*x**n + 1,  
a*x**b + 1,

1/x,  
1/(x + 1),  
1/(x**2 + 1),  
1/(x**3 + 1),  
1/(a*x**n),  
1/(a*x**n + 1),  
1/(a*x**b + 1),

a*x**2/(b*x**2 + 1),  
a*x**3/(b*x**3 + 1),  
a*x**n/(b*x**m + 1),  
(a*x**2 + 1)/(b*x**2 + 1),  
(a*x**3 + 1)/(b*x**3 + 1),  
(a*x**n + 1)/(b*x**m + 1),  
(a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1),

sqrt(1/x),  
sqrt(1/(x + 1)),  
sqrt(1/(x**2 + 1)),  
sqrt(1/(x**3 + 1)),  
sqrt(1/(a*x**n)),  
sqrt(1/(a*x**n + 1)),  
sqrt(1/(a*x**b + 1)),  
sqrt(a*x**2/(b*x**2 + 1)),  
sqrt(a*x**3/(b*x**3 + 1)),  
sqrt(a*x**n/(b*x**m + 1)),  
sqrt((a*x**2 + 1)/(b*x**2 + 1)),  
sqrt((a*x**3 + 1)/(b*x**3 + 1)),  
sqrt((a*x**n + 1)/(b*x**m + 1)),  
sqrt((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

log(x),  
log(1/x),  
log(1/(x + 1)),  
log(1/(x**2 + 1)),  
log(1/(x**3 + 1)),  
log(1/(a*x**n)),  
log(1/(a*x**n + 1)),  
log(1/(a*x**b + 1)),  
log(a*x**2/(b*x**2 + 1)),  
log(a*x**3/(b*x**3 + 1)),  
log(a*x**n/(b*x**m + 1)),  
log((a*x**2 + 1)/(b*x**2 + 1)),  
log((a*x**3 + 1)/(b*x**3 + 1)),  
log((a*x**n + 1)/(b*x**m + 1)),  
log((a*x**5 + x**3 + 1)/(b*x**5 + x**3 + 1)),

sin(x),  
sin(x)**n*cos(x)**m,  
sin(a*x)**n*cos(b*x)**m,  
1/sin(x),  
1/(sin(x) + 1),  
1/(sin(x)**2 + 1),  
1/(sin(x)**3 + 1),  
1/(a*sin(x)**n),  
1/(a*sin(x)**n + 1),  
1/(a*sin(x)**b + 1),  
a*sin(x)**2/(b*sin(x)**2 + 1),  
a*sin(x)**3/(b*sin(x)**3 + 1),  
a*sin(x)**n/(b*sin(x)**m + 1),  
(a*sin(x)**2 + 1)/(b*sin(x)**2 + 1),  
(a*sin(x)**3 + 1)/(b*sin(x)**3 + 1),  
(a*sin(x)**n + 1)/(b*sin(x)**m + 1),  
(a*sin(x)**5 + sin(x)**3 + 1)/(b*sin(x)**5 + sin(x)**3 + 1),  
]

integrated = []  
durations = []

f_integrants = open('dump_integrants', 'w')  
f_integrated = open('dump_integrated', 'w')  
f_durations = open('dump_duration', 'w')

for index, integrant in enumerate(integrants):  
print '====================================='  
print index, ' of ', len(integrants)  
print '###', integrant  
start = clock()  
try:  
old_handler = signal.signal(signal.SIGALRM, timeout_handler)  
signal.alarm(120)  
integrated.append(indefinite_integral(integrant, x))  
signal.alarm(0)  
except Exception, e:  
integrated.append(e)  
finally:  
signal.signal(signal.SIGALRM, old_handler)  
durations.append(clock() - start)  
print '###', integrated[-1]  
print 'in %f seconds'%durations[-1]

f_integrants.write(str(integrant))  
f_integrated.write(str(integrated[-1]))  
f_durations.write(str(durations[-1]))  
f_integrants.write('\n')  
f_integrated.write('\n')  
f_durations.write('\n')  

The complete table (available as a wiki html page and a pdf).

Sage