Bulk Analysis
Androguard is capable of analysing thousands to millions of APKs.
It is also possible to use tools like multiprocessing for this job and
analyse APKs in parallel.
Usually you want to put the results of your analysis somewhere, for example a
database or some log file.
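The two points above can be combined in a minimal sketch: a multiprocessing pool analyses the APKs in parallel, and only a small summary per APK is written to an SQLite database. The function names, the table layout and the choice of summary fields (package name and number of permissions) are assumptions made for illustration, not something Androguard prescribes.

```python
import sqlite3
from multiprocessing import Pool


def analyse_one(path):
    """Analyse one APK and return a small, picklable summary row."""
    # Imported here so that only the worker processes need Androguard loaded.
    from androguard.misc import AnalyzeAPK

    try:
        a, _, dx = AnalyzeAPK(path)
        # Hypothetical summary: keep only what the later evaluation needs.
        return path, a.get_package(), len(a.get_permissions()), None
    except Exception as exc:
        # One broken APK should not abort the whole run; record the error.
        return path, None, None, repr(exc)


def store_results(db_path, rows):
    """Write (path, package, n_permissions, error) rows; return the row count."""
    con = sqlite3.connect(db_path)
    try:
        with con:  # commits on success, rolls back on an exception
            con.execute(
                "CREATE TABLE IF NOT EXISTS apk_results ("
                "path TEXT PRIMARY KEY, package TEXT, "
                "n_permissions INTEGER, error TEXT)"
            )
            cur = con.executemany(
                "INSERT OR REPLACE INTO apk_results VALUES (?, ?, ?, ?)", rows
            )
            return cur.rowcount
    finally:
        con.close()


def bulk_analyse(paths, db_path, worker=analyse_one, processes=None):
    """Run `worker` over all paths in parallel and store the summaries."""
    with Pool(processes) as pool:
        # Results arrive in completion order and are written by the parent
        # process only, so the database has a single writer.
        return store_results(db_path, pool.imap_unordered(worker, paths))
```

Call `bulk_analyse(list_of_apk_paths, "results.sqlite")` from under an `if __name__ == "__main__":` guard, which multiprocessing requires on platforms that spawn new interpreter processes. Returning only small tuples from the workers keeps the data sent between processes small, instead of shipping whole analysis objects back to the parent.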
It is also possible to use Session objects for this job, but you should be aware of some caveats:
1) Sessions take up a lot of space per APK. The resulting Session object can be more than 30 times larger than the original APK.
2) Sessions should not be used to add unrelated APKs. Again, the size will blow up, and you need to figure out which APK belongs where.
So the rule of thumb is: do not use Sessions for bulk analysis unless you
know what you are doing.
Another way is to pickle the resulting objects.
As the DalvikVMFormat objects are already stored in the Analysis object, there is no
need to pickle them separately.
Thus, it is only required to save the APK and Analysis objects.
This is an example of how to obtain the two objects and save them to disk:
import sys
from pickle import dump
from hashlib import sha512
from androguard.misc import AnalyzeAPK
a, _, dx = AnalyzeAPK('examples/tests/a2dp.Vol_137.apk')
sha = sha512()
sha.update(a.get_raw())
with open("{}_apk.p".format(sha.hexdigest()), "wb") as fp:
    dump(a, fp)
with open("{}_analysis.p".format(sha.hexdigest()), "wb") as fp:
    # Pickling the Analysis object seems to exceed the default recursion
    # limit, so raise it before dumping.
    sys.setrecursionlimit(50000)
    dump(dx, fp)
But the resulting files are very large, especially the pickled Analysis object:
$ du -sh examples/tests/a2dp.Vol_137.apk
808K examples/tests/a2dp.Vol_137.apk
$ du -sh *.p
31M 24a62690a770891a8f43d71e8f7beb24821d46a75e017ef4f4e6a04624105466621c96305d8e86f9900042e3ef1d5806a5d9ac873bebdf798483790446bd275e_analysis.p
852K 24a62690a770891a8f43d71e8f7beb24821d46a75e017ef4f4e6a04624105466621c96305d8e86f9900042e3ef1d5806a5d9ac873bebdf798483790446bd275e_apk.p
But it is possible to compress both files to save disk space:
import sys
import lzma
from pickle import dump
from hashlib import sha512
from androguard.misc import AnalyzeAPK
a, _, dx = AnalyzeAPK('examples/tests/a2dp.Vol_137.apk')
sha = sha512()
sha.update(a.get_raw())
with lzma.open("{}_apk.p.lzma".format(sha.hexdigest()), "wb") as fp:
    dump(a, fp)
with lzma.open("{}_analysis.p.lzma".format(sha.hexdigest()), "wb") as fp:
    # Pickling the Analysis object seems to exceed the default recursion
    # limit, so raise it before dumping.
    sys.setrecursionlimit(50000)
    dump(dx, fp)
which results in much smaller files:
$ du -sh *.lzma
4,5M 24a62690a770891a8f43d71e8f7beb24821d46a75e017ef4f4e6a04624105466621c96305d8e86f9900042e3ef1d5806a5d9ac873bebdf798483790446bd275e_analysis.p.lzma
748K 24a62690a770891a8f43d71e8f7beb24821d46a75e017ef4f4e6a04624105466621c96305d8e86f9900042e3ef1d5806a5d9ac873bebdf798483790446bd275e_apk.p.lzma
Obviously, as the APK is already packed, there is not much to compress anymore.
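To get the objects back, the files written above can be read with the matching opener. The following is a minimal sketch. The helper name `load_pickled` is hypothetical, and raising the recursion limit on load is an assumption that mirrors the dumping code. It may not be needed in every case.

```python
import lzma
import pickle
import sys


def load_pickled(path, recursion_limit=50000):
    """Load an object from a plain (.p) or LZMA-compressed (.p.lzma) pickle."""
    # Assumption: loading may recurse as deeply as dumping did, so use the
    # same raised limit (never lower a limit that is already higher).
    if sys.getrecursionlimit() < recursion_limit:
        sys.setrecursionlimit(recursion_limit)
    # Pick the opener by file extension; both return binary file objects.
    opener = lzma.open if str(path).endswith(".lzma") else open
    with opener(path, "rb") as fp:
        return pickle.load(fp)
```

For example, `dx = load_pickled("<sha512>_analysis.p.lzma")` would restore the Analysis object, where `<sha512>` stands for the hash used as the file name. Androguard must be importable at load time, because pickle stores only references to the classes and not the classes themselves. Only load pickle files that you created yourself or otherwise trust, as unpickling can execute arbitrary code.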