Most of us have heard about sparse and dense matrices in Python. I tried to compare the memory usage of the two, using a NumPy array for the dense case and a SciPy CSR matrix for the sparse case.
The code is as follows:
import math
import sys

import matplotlib.pyplot as plt
import numpy as np
import scipy.sparse as sps


def createGeometric(start, ratio, lengthofSequence):
    # Yield start, start*ratio, start*ratio**2, ... (lengthofSequence terms).
    i = 0
    while i < lengthofSequence:
        yield start * pow(ratio, i)
        i = i + 1


numpySizeList = []
scipySizeList = []
lengthList = []
for value in createGeometric(1, 3, 10):
    a = np.random.rand(value)                  # dense array of `value` floats
    b = sps.csr_matrix(np.random.rand(value))  # another random array, stored as CSR
    numpySizeList.append(sys.getsizeof(a))
    scipySizeList.append(sys.getsizeof(b))
    lengthList.append(value)

# Take logs so that all three curves fit on one axis.
# list(...) is needed on Python 3, where map() returns an iterator.
numpySizeList = list(map(math.log, numpySizeList))
scipySizeList = list(map(math.log, scipySizeList))
lengthList = list(map(math.log, lengthList))

plt.figure()
plt.plot(numpySizeList, color="blue")   # dense (NumPy)
plt.plot(scipySizeList, color="green")  # sparse (SciPy CSR)
plt.plot(lengthList, color="red")       # array length
plt.show()
I plotted the space usage:
RED: log of the length of the array
GREEN: log of the size of the sparse array
BLUE: log of the size of the dense array
The difference is immense. I am trying to understand how the underlying data structures work for these matrices.
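To make the question more concrete, here is a minimal sketch of what a CSR matrix holds internally, together with a cross-check on the measurement. It relies on the `data`, `indices` and `indptr` attributes of `csr_matrix` and on `ndarray.nbytes`; the helper `csr_nbytes` is a hypothetical name introduced only for this illustration. As far as I can tell, `sys.getsizeof` on a `csr_matrix` counts only the Python wrapper object and not the arrays it references, so the sum of the three `nbytes` values may be a fairer figure to compare against the dense array.

```python
import sys

import numpy as np
import scipy.sparse as sps


def csr_nbytes(m):
    """Bytes held by the three arrays backing a CSR matrix (hypothetical helper)."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes


# A small 3x4 matrix that is mostly zeros.
dense = np.array([[0.0, 0.0, 3.0, 0.0],
                  [1.0, 0.0, 0.0, 2.0],
                  [0.0, 0.0, 0.0, 0.0]])
sparse = sps.csr_matrix(dense)

print(sparse.data)     # the stored non-zero values, row by row
print(sparse.indices)  # the column index of each stored value
print(sparse.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]

# Fully dense random input, as in the code above: CSR has to store
# every value plus a column index for each one.
full = np.random.rand(1, 1000)
full_csr = sps.csr_matrix(full)
print(full.nbytes, csr_nbytes(full_csr), sys.getsizeof(full_csr))
```

Under these assumptions, an array from `np.random.rand` contains essentially no zeros, so its CSR form should need more memory than the dense array, not less. The saving only appears when most entries are zero.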