androguard.core.bytecodes package

The bytecodes modules are a core feature of Androguard. They contain parsers for APK, AXML, DEX, ODEX and DEY files, as well as for encodings used inside these formats, such as MUTF-8 for strings in DEX files and the widely used LEB128 encoding for numbers.
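
As an illustration of the LEB128 encoding mentioned above, here is a minimal sketch of an unsigned LEB128 decoder. The function name decode_uleb128 is hypothetical and is not part of Androguard's API; it only shows how the variable-length format works.

```python
from typing import Tuple


def decode_uleb128(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, bytes_consumed)."""
    result = 0
    shift = 0
    consumed = 0
    while True:
        byte = data[offset + consumed]
        consumed += 1
        # The low 7 bits carry payload, least significant group first.
        result |= (byte & 0x7F) << shift
        # A clear high bit marks the last byte of the value.
        if not byte & 0x80:
            return result, consumed
        shift += 7


value, size = decode_uleb128(bytes([0xE5, 0x8E, 0x26]))
print(value, size)  # 624485 3
```

Returning the number of consumed bytes lets a caller advance through a buffer that holds several encoded values back to back.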

The most important classes are probably androguard.core.bytecodes.apk.APK and androguard.core.bytecodes.dvm.DalvikVMFormat.

Submodules

androguard.core.bytecodes.apk module

class androguard.core.bytecodes.apk.APK(filename, raw=False, magic_file=None, skip_analysis=False, testzip=False)

Bases: object

files

Returns a dictionary of filenames and detected magic type

Returns:dictionary of files and their mime type
find_tags(tag_name, **attribute_filter)

Return a list of all the matched tags in all available xml

Parameters:tag_name (str) – specify the tag name
find_tags_from_xml(xml_name, tag_name, **attribute_filter)

Return a list of all the matched tags in a specific xml

Parameters:
  • xml_name (str) – specify from which xml to pick the tag from
  • tag_name (str) – specify the tag name

get_activities()

Return the android:name attribute of all activities

Return type:a list of str
get_all_attribute_value(tag_name, attribute, format_value=True, **attribute_filter)

Yields all the attribute values in xml files which match with the tag name and the specific attribute

Parameters:
  • tag_name (str) – specify the tag name
  • attribute (str) – specify the attribute
  • format_value (bool) – specify if the value needs to be formatted with packagename
get_all_dex()

Return the raw data of all classes dex files

Return type:a generator of bytes
get_android_manifest_axml()

Return the AXMLPrinter object which corresponds to the AndroidManifest.xml file

Return type:AXMLPrinter
get_android_manifest_xml()

Return the parsed xml object which corresponds to the AndroidManifest.xml file

Return type:Element
get_android_resources()

Return the ARSCParser object which corresponds to the resources.arsc file

Return type:ARSCParser
get_androidversion_code()

Return the android version code

This information is read from the AndroidManifest.xml

Return type:str
get_androidversion_name()

Return the android version name

This information is read from the AndroidManifest.xml

Return type:str
get_app_icon(max_dpi=65536)

Return the first icon file name whose density is not greater than max_dpi, unless an exact icon resolution is set in the manifest, in which case return the exact file.

This information is read from the AndroidManifest.xml

From https://developer.android.com/guide/practices/screens_support.html and https://developer.android.com/ndk/reference/group___configuration.html

  • DEFAULT 0dpi
  • ldpi (low) 120dpi
  • mdpi (medium) 160dpi
  • TV 213dpi
  • hdpi (high) 240dpi
  • xhdpi (extra-high) 320dpi
  • xxhdpi (extra-extra-high) 480dpi
  • xxxhdpi (extra-extra-extra-high) 640dpi
  • anydpi 65534dpi (0xFFFE)
  • nodpi 65535dpi (0xFFFF)

There is a difference between nodpi and anydpi. nodpi is the fallback for everything else: it is used if no other density is specified or if no density matches. If there is a resource that matches the DPI, that resource is used instead. anydpi is also valid for all densities, but it overrules all other files. Therefore anydpi is usually used with vector graphics and with constraints on the API level. For example, adaptive icons are usually marked as anydpi.

When selecting an icon, the flow is as follows:

  1. is there an anydpi icon?
  2. is there an icon for the dpi of the device?
  3. is there a nodpi icon?
  4. (only on very old devices) is there an icon with dpi 0 (the default)?

For more information read here: https://stackoverflow.com/a/34370735/446140

Return type:str
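
The four-step selection flow described for get_app_icon() can be sketched as follows. This is a simplified illustration, not Androguard's implementation: the function pick_icon and the icons mapping are hypothetical, and step 2 only checks for an exact density match, whereas a real device may also scale a neighbouring density.

```python
from typing import Dict, Optional

ANYDPI = 0xFFFE   # 65534
NODPI = 0xFFFF    # 65535
DEFAULT_DPI = 0


def pick_icon(icons: Dict[int, str], device_dpi: int) -> Optional[str]:
    """Walk the four-step flow; `icons` maps density to file name."""
    # Order matters: anydpi overrules everything, nodpi is the fallback,
    # and the default density (0) is only the last resort.
    for density in (ANYDPI, device_dpi, NODPI, DEFAULT_DPI):
        if density in icons:
            return icons[density]
    return None


icons = {160: "res/mipmap-mdpi/ic.png", NODPI: "res/mipmap-nodpi/ic.png"}
print(pick_icon(icons, 160))  # exact match wins over nodpi
print(pick_icon(icons, 480))  # no match, so the nodpi fallback is used
```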
get_app_name()

Return the appname of the APK

This name is read from the AndroidManifest.xml using the application android:label. If no label exists, the android:label of the main activity is used.

If there is also no main activity label, an empty string is returned.

Return type:str
get_attribute_value(tag_name, attribute, format_value=False, **attribute_filter)

Return the attribute value in xml files which matches the tag name and the specific attribute

Parameters:
  • tag_name (str) – specify the tag name
  • attribute (str) – specify the attribute
  • format_value (bool) – specify if the value needs to be formatted with packagename
get_certificate(filename)

Return an X.509 certificate object for the given signature file name in the APK

Parameters:filename – filename of the signature file in the APK
Returns:a Certificate object
get_certificate_der(filename)

Return the DER coded X.509 certificate from the signature file.

Parameters:filename – Signature filename in APK
Returns:DER coded X.509 certificate as binary
get_certificates()

Return a list of unique asn1crypto.x509.Certificate which are found in v1, v2 and v3 signing. Note that we simply extract all certificates regardless of the signer. Therefore this is just a list of all certificates found in all signers.

get_certificates_der_v2()

Return a list of DER coded X.509 certificates from the v2 signature block

get_certificates_der_v3()

Return a list of DER coded X.509 certificates from the v3 signature block

get_certificates_v1()

Return a list of asn1crypto.x509.Certificate which are found in the META-INF folder (v1 signing). Note that we simply extract all certificates regardless of the signer. Therefore this is just a list of all certificates found in all signers.

get_certificates_v2()

Return a list of asn1crypto.x509.Certificate which are found in the v2 signing block. Note that we simply extract all certificates regardless of the signer. Therefore this is just a list of all certificates found in all signers.

get_certificates_v3()

Return a list of asn1crypto.x509.Certificate which are found in the v3 signing block. Note that we simply extract all certificates regardless of the signer. Therefore this is just a list of all certificates found in all signers.

get_declared_permissions()

Returns list of the declared permissions.

Return type:list of strings
get_declared_permissions_details()

Returns declared permissions with the details.

Return type:dict
get_details_permissions()

Return permissions with details.

This can only return details about a permission if the permission is defined in the AOSP.

Return type:dict of {permission: [protectionLevel, label, description]}
get_dex()

Return the raw data of the classes dex file

This will give you the data of the file called classes.dex inside the APK. If the APK has multiple DEX files, you need to use get_all_dex().

Return type:bytes
get_dex_names()

Return the names of all DEX files found in the APK. This method only accounts for “official” dex files, i.e. all files in the root directory of the APK named classes.dex or classes[0-9]+.dex

Return type:a list of str
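
The naming rule for “official” DEX files can be expressed as a regular expression. The sketch below is illustrative only; the pattern and the helper official_dex_names are assumptions and not taken from Androguard's source.

```python
import re

# Matches classes.dex, or classes followed by one or more digits and .dex.
DEX_NAME = re.compile(r"classes([0-9]+)?\.dex")


def official_dex_names(names):
    """Keep only root-level names that follow the classesN.dex convention."""
    # fullmatch rejects names with a directory prefix such as assets/.
    return [name for name in names if DEX_NAME.fullmatch(name)]


members = ["classes.dex", "classes2.dex", "assets/classes.dex", "classesX.dex"]
print(official_dex_names(members))  # ['classes.dex', 'classes2.dex']
```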
get_effective_target_sdk_version()

Return the effective targetSdkVersion, always returns int > 0.

If the targetSdkVersion is not set, it defaults to 1. This is set based on defaults as defined in: https://developer.android.com/guide/topics/manifest/uses-sdk-element.html

Return type:int
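
A minimal sketch of such a defaulting rule is shown below. It assumes the fallback chain from the linked Android documentation (targetSdkVersion falls back to minSdkVersion, which falls back to 1); whether Androguard follows exactly this chain is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def effective_target_sdk(target_sdk: Optional[str], min_sdk: Optional[str]) -> int:
    """Return the first usable SDK value as an int, or 1 if none is usable."""
    for candidate in (target_sdk, min_sdk):
        try:
            value = int(candidate)
        except (TypeError, ValueError):
            # Attribute missing (None) or not a plain number: try the next one.
            continue
        if value > 0:
            return value
    return 1


print(effective_target_sdk(None, "21"))  # 21
```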
get_element(tag_name, attribute, **attribute_filter)

Deprecated since version 3.3.5: use get_attribute_value() instead

Return element in xml files which match with the tag name and the specific attribute

Parameters:
  • tag_name (str) – specify the tag name
  • attribute (str) – specify the attribute
Return type:

str

get_elements(tag_name, attribute, with_namespace=True)

Deprecated since version 3.3.5: use get_all_attribute_value() instead

Return elements in xml files which match with the tag name and the specific attribute

Parameters:
  • tag_name (str) – a string which specify the tag name
  • attribute (str) – a string which specify the attribute
get_features()

Return a list of all android:names found for the tag uses-feature in the AndroidManifest.xml

Returns:list
get_file(filename)

Return the raw data of the specified filename inside the APK

Return type:bytes
get_filename()

Return the filename of the APK

Return type:str
get_files()

Return the file names inside the APK.

Return type:a list of str
get_files_crc32()

Calculates and returns a dictionary of filenames and CRC32

Returns:dict of filename: CRC32
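
To illustrate what a filename-to-CRC32 dictionary looks like, the following sketch computes one with the standard library only. The in-memory archive stands in for an APK, and files_crc32 is a hypothetical helper, not the Androguard method.

```python
import io
import zipfile
import zlib


def files_crc32(zip_bytes: bytes) -> dict:
    """Map each archive member name to the CRC32 of its uncompressed data."""
    result = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            result[name] = zlib.crc32(archive.read(name))
    return result


# Build a tiny in-memory archive standing in for an APK.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("classes.dex", b"dex\n035\x00")
    archive.writestr("AndroidManifest.xml", b"<manifest/>")

crcs = files_crc32(buffer.getvalue())
for name, crc in crcs.items():
    print(name, hex(crc))
```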
get_files_information()

Return the files inside the APK with their associated types and crc32

Return type:str, str, int
get_files_types()

Return the files inside the APK with their associated types (by using python-magic)

At the same time, the CRC32 are calculated for the files.

Return type:a dictionary
get_intent_filters(itemtype, name)

Find intent filters for a given item and name.

Intent filters are attached to activities, services or receivers. You can search for the intent filters of such items and get a dictionary of all attached actions and intent categories.

Parameters:
  • itemtype – the type of parent item to look for, e.g. activity, service or receiver
  • name – the android:name of the parent item, e.g. activity name
Returns:

a dictionary with the keys action and category containing the android:name of those items
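
The shape of the returned dictionary can be illustrated on a plain-text manifest with the standard library. This is a sketch of the lookup idea only: the helper intent_filters and the sample manifest are hypothetical, and Androguard itself works on the decoded binary AndroidManifest.xml.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

MANIFEST = """\
<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <application>
    <activity android:name=".MainActivity">
      <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
        <category android:name="android.intent.category.LAUNCHER"/>
      </intent-filter>
    </activity>
  </application>
</manifest>
"""


def intent_filters(root, itemtype, name):
    """Collect action and category names below the matching parent item."""
    found = {"action": [], "category": []}
    for item in root.iter(itemtype):
        if item.get(ANDROID_NS + "name") != name:
            continue
        for intent_filter in item.findall("intent-filter"):
            for key in found:
                for child in intent_filter.findall(key):
                    found[key].append(child.get(ANDROID_NS + "name"))
    return found


filters = intent_filters(ET.fromstring(MANIFEST), "activity", ".MainActivity")
print(filters)
```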

get_libraries()

Return the android:name attributes for libraries

Return type:list
get_main_activities()

Return names of the main activities

These values are read from the AndroidManifest.xml

Return type:a set of str
get_main_activity()

Return the name of the main activity

This value is read from the AndroidManifest.xml

Return type:str
get_max_sdk_version()

Return the android:maxSdkVersion attribute

Return type:string
get_min_sdk_version()

Return the android:minSdkVersion attribute

Return type:string
get_package()

Return the name of the package

This information is read from the AndroidManifest.xml

Return type:str
get_permissions()

Return permissions names declared in the AndroidManifest.xml.

It is possible that permissions are returned multiple times, as this function does not filter the permissions, i.e. it shows you exactly what was defined in the AndroidManifest.xml.

Implied permissions, which are granted automatically, are not returned here. Use get_uses_implied_permission_list() if you need a list of implied permissions.

Returns:A list of permissions
Return type:list
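
Because get_permissions() may return the same permission more than once, a caller who wants each permission only once can deduplicate the list while keeping the manifest order. A minimal sketch, with a hypothetical helper name and made-up sample data:

```python
def unique_in_order(permissions):
    """Drop repeated entries but keep the first-seen order."""
    # dict preserves insertion order, so the keys form an ordered set.
    return list(dict.fromkeys(permissions))


requested = [
    "android.permission.INTERNET",
    "android.permission.CAMERA",
    "android.permission.INTERNET",
]
print(unique_in_order(requested))
```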
get_providers()

Return the android:name attribute of all providers

Return type:a list of string
get_public_keys_der_v2()

Return a list of DER coded X.509 public keys from the v2 signature block

get_public_keys_der_v3()

Return a list of DER coded X.509 public keys from the v3 signature block

get_public_keys_v2()

Return a list of asn1crypto.keys.PublicKeyInfo which are found in the v2 signing block.

get_public_keys_v3()

Return a list of asn1crypto.keys.PublicKeyInfo which are found in the v3 signing block.

get_raw()

Return raw bytes of the APK

Return type:bytes
get_receivers()

Return the android:name attribute of all receivers

Return type:a list of string
get_requested_aosp_permissions()

Returns requested permissions declared within AOSP project.

This includes several other permissions as well, which are in the platform apps.

Return type:list of str
get_requested_aosp_permissions_details()

Returns requested aosp permissions with details.

Return type:dictionary
get_requested_permissions()

Deprecated since version 3.1.0: use get_permissions() instead.

Returns all requested permissions.

It has the same result as get_permissions() and might be removed in the future

Return type:list of str
get_requested_third_party_permissions()

Returns list of requested permissions not declared within AOSP project.

Return type:list of strings
get_services()

Return the android:name attribute of all services

Return type:a list of str
get_signature()

Return the data of the first signature file found (v1 Signature / JAR Signature)

Return type:First signature name or None if not signed
get_signature_name()

Return the name of the first signature file found.

get_signature_names()

Return a list of the signature file names (v1 Signature / JAR Signature)

Return type:List of filenames matching a Signature
get_signatures()

Return a list of the data of the signature files. Only v1 / JAR Signing.

Return type:list of bytes
get_target_sdk_version()

Return the android:targetSdkVersion attribute

Return type:string
get_uses_implied_permission_list()

Return all permissions implied by the target SDK or other permissions.

Return type:list of string
get_value_from_tag(tag, attribute)

Return the value of the android prefixed attribute in a specific tag.

This function will always try to get the attribute with an android: prefix first, and will fall back to the attribute without the prefix if the prefixed attribute could not be found. This is useful for some broken AndroidManifest.xml files, where no android namespace is set, but could also indicate malicious activity (i.e. wrongly repackaged files). A warning is printed if the attribute is found without a namespace prefix.

If you need the exact result, query the tag directly:

example::
>>> from lxml.etree import Element
>>> tag = Element('bar', nsmap={'android': 'http://schemas.android.com/apk/res/android'})
>>> tag.set('{http://schemas.android.com/apk/res/android}foobar', 'barfoo')
>>> tag.set('name', 'baz')
>>> # Assume that `a` is some APK object
>>> a.get_value_from_tag(tag, 'name')
'baz'
>>> tag.get('name')
'baz'
>>> tag.get('foobar') is None
True
>>> a.get_value_from_tag(tag, 'foobar')
'barfoo'
Parameters:
  • tag (lxml.etree.Element) – specify the tag element
  • attribute (str) – specify the attribute name
Returns:

the attribute’s value, or None if the attribute is not present

is_androidtv()

Checks if this application does not require a touchscreen, as this is the rule to get into the TV section of the Play Store. See https://developer.android.com/training/tv/start/start.html for more information.

Returns:True if ‘android.hardware.touchscreen’ is not required, False otherwise
is_leanback()

Checks if this application is built for TV (Leanback support) by checking if it uses the feature ‘android.software.leanback’

Returns:True if leanback feature is used, False otherwise
is_multidex()

Test if the APK has multiple DEX files

Returns:True if multiple dex found, otherwise False
is_signed()

Returns true if either a v1 or v2 (or both) signature was found.

is_signed_v1()

Returns true if a v1 / JAR signature was found.

Returning True does not mean that the file is properly signed! It just says that there is a signature file which needs to be validated.

is_signed_v2()

Returns true if a v2 / APK signature was found.

Returning True does not mean that the file is properly signed! It just says that there is a signature file which needs to be validated.

is_signed_v3()

Returns true if a v3 / APK signature was found.

Returning True does not mean that the file is properly signed! It just says that there is a signature file which needs to be validated.

is_tag_matched(tag, **attribute_filter)

Return true if the attributes of the tag match the attribute filter.

An attribute filter is a dictionary of the form {attribute_name: value}. This function returns True if and only if every attribute in the filter has the given value on the tag. The dictionary can be passed as keyword arguments, so you can filter like this:

example::
a.is_tag_matched(tag, name="foobar", other="barfoo")

This function uses a fallback for attribute searching. By default it looks up the namespaced variant and falls back to the non-namespaced variant. Thus specifying {"name": "foobar"} will match <bla name="foobar" /> as well as <bla android:name="foobar" />.

Parameters:
  • tag (lxml.etree.Element) – specify the tag element
  • attribute_filter – specify the attribute filter as dictionary
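
The matching rule with its namespace fallback can be sketched as follows. This is an illustration of the described behaviour, not Androguard's code: tag_matches is a hypothetical function, and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra dependencies.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def tag_matches(tag, **attribute_filter):
    """True only if every filtered attribute has the requested value."""
    for attribute, expected in attribute_filter.items():
        # Prefer the android: namespaced attribute, then the bare one.
        value = tag.get(ANDROID_NS + attribute)
        if value is None:
            value = tag.get(attribute)
        if value != expected:
            return False
    return True


namespaced = ET.fromstring(
    '<bla xmlns:android="http://schemas.android.com/apk/res/android" '
    'android:name="foobar"/>'
)
plain = ET.fromstring('<bla name="foobar"/>')

print(tag_matches(namespaced, name="foobar"), tag_matches(plain, name="foobar"))
```

Both tags match the same filter, which mirrors the fallback described above; adding a second filter entry that the tag lacks makes the match fail.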
is_valid_APK()

Return true if the APK is valid, false otherwise. An APK is seen as valid if the AndroidManifest.xml could be parsed successfully. This does not mean that the APK has a valid signature, nor that the APK can be installed on an Android system.

Return type:boolean
is_wearable()

Checks if this application is built for wearables by checking if it uses the feature ‘android.hardware.type.watch’. See https://developer.android.com/training/wearables/apps/creating.html for more information.

Not every app is setting this feature (not even the example Google provides), so it might be wise to not 100% rely on this feature.

Returns:True if wearable, False otherwise
new_zip(filename, deleted_files=None, new_files={})

Create a new zip file

Parameters:
  • filename (string) – the output filename of the zip
  • deleted_files (None or a string) – a regex pattern to remove specific files
  • new_files (a dictionary (key:filename, value:content of the file)) – a dictionary of new files
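
The idea behind these parameters can be sketched with the standard library. The helper rewrite_zip below is hypothetical and works on in-memory bytes; it assumes that new_files replaces the content of members that already exist in the archive, which may differ in detail from what new_zip() does.

```python
import io
import re
import zipfile


def rewrite_zip(source: bytes, deleted_files=None, new_files=None) -> bytes:
    """Copy an archive, skipping regex matches and swapping in new contents."""
    new_files = new_files or {}
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(source)) as zin, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            if deleted_files is not None and re.match(deleted_files, item.filename):
                continue  # dropped from the output
            if item.filename in new_files:
                zout.writestr(item.filename, new_files[item.filename])
            else:
                zout.writestr(item, zin.read(item.filename))
    return out.getvalue()


# Build a small archive standing in for an APK.
src = io.BytesIO()
with zipfile.ZipFile(src, "w") as archive:
    archive.writestr("META-INF/CERT.RSA", b"sig")
    archive.writestr("classes.dex", b"old")
    archive.writestr("res/a.png", b"png")

result = rewrite_zip(
    src.getvalue(),
    deleted_files=r"META-INF/.*",
    new_files={"classes.dex": b"new"},
)
```

Using None instead of a dictionary as the default for new_files in this sketch avoids sharing one mutable default object between calls.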
parse_signatures_or_digests(digest_bytes)

Parse digests

parse_v2_signing_block()

Parse the V2 signing block and extract all features

parse_v2_v3_signature()
parse_v3_signing_block()

Parse the V3 signing block and extract all features

read_uint32_le(io_stream)
show()
class androguard.core.bytecodes.apk.APKV2SignedData

Bases: object

This class holds all data associated with an APK V2 SigningBlock signed data. source : https://source.android.com/security/apksigning/v2.html

class androguard.core.bytecodes.apk.APKV2Signer

Bases: object

This class holds all data associated with an APK V2 SigningBlock signer. source : https://source.android.com/security/apksigning/v2.html

class androguard.core.bytecodes.apk.APKV3SignedData

Bases: androguard.core.bytecodes.apk.APKV2SignedData

This class holds all data associated with an APK V3 SigningBlock signed data. source : https://source.android.com/security/apksigning/v3.html

class androguard.core.bytecodes.apk.APKV3Signer

Bases: androguard.core.bytecodes.apk.APKV2Signer

This class holds all data associated with an APK V3 SigningBlock signer. source : https://source.android.com/security/apksigning/v3.html

exception androguard.core.bytecodes.apk.BrokenAPKError

Bases: androguard.core.bytecodes.apk.Error

exception androguard.core.bytecodes.apk.Error

Bases: Exception

Base class for exceptions in this module.

exception androguard.core.bytecodes.apk.FileNotPresent

Bases: androguard.core.bytecodes.apk.Error

androguard.core.bytecodes.apk.ensure_final_value(packageName, arsc, value)

Ensure incoming value is always the value, not the resid

androguard will sometimes return the Android “resId” aka Resource ID instead of the actual value. This checks whether the value is actually a resId, then performs the Android Resource lookup as needed.

androguard.core.bytecodes.apk.get_apkid(apkfile)

Read (appid, versionCode, versionName) from an APK

This first tries to do quick binary XML parsing to just get the values that are needed. It will fall back to full androguard parsing, which is slow, if it can’t find the versionName value or versionName is set to an Android String Resource (e.g. an integer hex value that starts with @).

androguard.core.bytecodes.apk.parse_lxml_dom(tree)
androguard.core.bytecodes.apk.show_Certificate(cert, short=False)

Print Fingerprints, Issuer and Subject of an X509 Certificate.

Parameters:
  • cert (asn1crypto.x509.Certificate) – X509 Certificate to print
  • short (Boolean) – Print in shortform for DN (Default: False)

androguard.core.bytecodes.dvm module

class androguard.core.bytecodes.dvm.AnnotationElement(buff, cm)

Bases: object

This class can parse an annotation_element of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotation_element
  • cm (ClassManager) – a ClassManager object
get_length()
get_name_idx()

Return the element name, represented as an index into the string_ids section

Return type:int
get_obj()
get_raw()
get_value()

Return the element value (EncodedValue)

Return type:a EncodedValue object
show()
class androguard.core.bytecodes.dvm.AnnotationItem(buff, cm)

Bases: object

This class can parse an annotation_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotation_item
  • cm (ClassManager) – a ClassManager object
get_annotation()

Return the encoded annotation contents

Return type:a EncodedAnnotation object
get_length()
get_obj()
get_off()
get_raw()
get_visibility()

Return the intended visibility of this annotation

Return type:int
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.AnnotationOffItem(buff, cm)

Bases: object

This class can parse an annotation_off_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotation_off_item
  • cm (ClassManager) – a ClassManager object
get_annotation_off()
get_length()
get_obj()
get_raw()
show()
class androguard.core.bytecodes.dvm.AnnotationSetItem(buff, cm)

Bases: object

This class can parse an annotation_set_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotation_set_item
  • cm (ClassManager) – a ClassManager object
get_annotation_off_item()

Return the offset from the start of the file to an annotation

Return type:a list of AnnotationOffItem
get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.AnnotationSetRefItem(buff, cm)

Bases: object

This class can parse an annotation_set_ref_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotation_set_ref_item
  • cm (ClassManager) – a ClassManager object
get_annotations_off()

Return the offset from the start of the file to the referenced annotation set or 0 if there are no annotations for this element.

Return type:int
get_obj()
get_raw()
show()
class androguard.core.bytecodes.dvm.AnnotationSetRefList(buff, cm)

Bases: object

This class can parse an annotation_set_ref_list_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotation_set_ref_list_item
  • cm (ClassManager) – a ClassManager object
get_length()
get_list()

Return elements of the list

Return type:AnnotationSetRefItem
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.AnnotationsDirectoryItem(buff, cm)

Bases: object

This class can parse an annotations_directory_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the annotations_directory_item
  • cm (ClassManager) – a ClassManager object
get_annotated_fields_size()

Return the count of fields annotated by this item

Return type:int
get_annotated_methods_size()

Return the count of methods annotated by this item

Return type:int
get_annotated_parameters_size()

Return the count of method parameter lists annotated by this item

Return type:int
get_class_annotations_off()

Return the offset from the start of the file to the annotations made directly on the class, or 0 if the class has no direct annotations

Return type:int
get_field_annotations()

Return the list of associated field annotations

Return type:a list of FieldAnnotation
get_length()
get_method_annotations()

Return the list of associated method annotations

Return type:a list of MethodAnnotation
get_obj()
get_off()
get_parameter_annotations()

Return the list of associated method parameter annotations

Return type:a list of ParameterAnnotation
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.ClassDataItem(buff, cm)

Bases: object

This class can parse a class_data_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the class_data_item
  • cm (ClassManager) – a ClassManager object
get_direct_methods()

Return the defined direct (any of static, private, or constructor) methods, represented as a sequence of encoded elements

Return type:a list of EncodedMethod objects
get_direct_methods_size()

Return the number of direct methods defined in this item

Return type:int
get_fields()

Return static and instance fields

Return type:a list of EncodedField objects
get_instance_fields()

Return the defined instance fields, represented as a sequence of encoded elements

Return type:a list of EncodedField objects
get_instance_fields_size()

Return the number of instance fields defined in this item

Return type:int
get_length()
get_methods()

Return direct and virtual methods

Return type:a list of EncodedMethod objects
get_obj()
get_off()
get_raw()
get_static_fields()

Return the defined static fields, represented as a sequence of encoded elements

Return type:a list of EncodedField objects
get_static_fields_size()

Return the number of static fields defined in this item

Return type:int
get_virtual_methods()

Return the defined virtual (none of static, private, or constructor) methods, represented as a sequence of encoded elements

Return type:a list of EncodedMethod objects
get_virtual_methods_size()

Return the number of virtual methods defined in this item

Return type:int
reload()
set_off(off)
set_static_fields(value)
show()
class androguard.core.bytecodes.dvm.ClassDefItem(buff, cm)

Bases: object

This class can parse a class_def_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the class_def_item
  • cm (ClassManager) – a ClassManager object
get_access_flags()

Return the access flags for the class (public, final, etc.)

Return type:int
get_access_flags_string()

Return the access flags string of the class

Return type:str
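
The relationship between the integer access flags and a readable string can be sketched as follows. The bit values are the class-level access flags from the DEX format; the function name and the exact output format are assumptions and may differ from what get_access_flags_string() returns.

```python
# Class-level access flag bits as defined by the DEX format.
ACCESS_FLAGS = {
    0x0001: "public",
    0x0002: "private",
    0x0004: "protected",
    0x0008: "static",
    0x0010: "final",
    0x0200: "interface",
    0x0400: "abstract",
    0x1000: "synthetic",
    0x2000: "annotation",
    0x4000: "enum",
}


def access_flags_to_string(flags: int) -> str:
    """Join the names of all set flag bits, lowest bit first."""
    names = [name for bit, name in sorted(ACCESS_FLAGS.items()) if flags & bit]
    return " ".join(names)


print(access_flags_to_string(0x0011))  # public final
```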
get_annotations_off()

Return the offset from the start of the file to the annotations structure for this class, or 0 if there are no annotations on this class.

Return type:int
get_ast()
get_class_data()

Return the associated class_data_item

Return type:a ClassDataItem object
get_class_data_off()

Return the offset from the start of the file to the associated class data for this item, or 0 if there is no class data for this class

Return type:int
get_class_idx()

Return the index into the type_ids list for this class

Return type:int
get_fields()

Return all fields of this class

Return type:a list of EncodedField objects
get_interfaces()

Return the name of the interface

Return type:str
get_interfaces_off()

Return the offset from the start of the file to the list of interfaces, or 0 if there are none

Return type:int
get_length()
get_methods()

Return all methods of this class

Return type:a list of EncodedMethod objects
get_name()

Return the name of this class

Return type:str
get_obj()
get_raw()
get_source()
get_source_ext()
get_source_file_idx()

Return the index into the string_ids list for the name of the file containing the original source for (at least most of) this class, or the special value NO_INDEX to represent a lack of this information

Return type:int
get_static_values_off()

Return the offset from the start of the file to the list of initial values for static fields, or 0 if there are none (and all static fields are to be initialized with 0 or null)

Return type:int
get_superclass_idx()

Return the index into the type_ids list for the superclass

Return type:int
get_superclassname()

Return the name of the super class

Return type:str
reload()
set_name(value)
show()
source()

Return the source code of the entire class

Return type:string
class androguard.core.bytecodes.dvm.ClassHDefItem(size, buff, cm)

Bases: object

This class can parse a list of class_def_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the list of class_def_item
  • cm (ClassManager) – a ClassManager object
get_class_idx(idx)
get_length()
get_method(name_class, name_method)
get_names()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.ClassManager(vm, config)

Bases: object

This class is used to access all elements (strings, type, proto …) of the dex format based on their offset or index.

add_type_item(type_item, c_item, item)
get_all_engine()

Deprecated since version 3.3.5: do not use this function anymore!

get_ascii_string(s)
get_class_data_item(off)
get_code(idx)
get_debug_off(off)
get_encoded_array_item(off)
get_engine()

Deprecated since version 3.3.5: do not use this function anymore!

get_field(idx)
get_field_ref(idx)
get_item_by_offset(offset)
get_lazy_analysis()

Deprecated since version 3.3.5: do not use this function anymore!

get_method(idx)
get_method_ref(idx)
get_next_offset_item(idx)
get_obj_by_offset(offset)

Returns an object from a given offset inside the DEX file

get_odex_format()

Returns True if the underlying VM is ODEX

get_proto(idx)
get_raw_string(idx)

Return the (unprocessed) string from the string table at index idx.

Parameters:idx (int) – the index in the string section
get_string(idx)

Return a string from the string table at index idx

Parameters:idx (int) – index in the string section
get_string_by_offset(offset)
get_type(idx)

Return the resolved type name based on the index

Parameters:idx (int) –
Returns:the type name
Return type:str
get_type_list(off)
get_type_ref(idx)
set_decompiler(decompiler)
set_hook_class_name(class_def, value)
set_hook_field_name(encoded_field, value)
set_hook_method_name(encoded_method, value)
set_hook_string(idx, value)
class androguard.core.bytecodes.dvm.CodeItem(size, buff, cm)

Bases: object

get_code(off)
get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.ConstString(orig_ins, value)

Bases: androguard.core.bytecodes.dvm.Instruction21c

Simulate a const-string instruction.

get_operands(idx=-1)

Return all operands

Return type:list
get_raw_string()
class androguard.core.bytecodes.dvm.DBGBytecode(cm, op_value)

Bases: object

add(value, ttype)
get_obj()
get_op_value()
get_raw()
get_value()
show()
class androguard.core.bytecodes.dvm.DCode(class_manager, offset, size, buff)

Bases: object

This class represents the instructions of a method

Parameters:
  • class_manager (ClassManager object) – the ClassManager
  • offset (int) – the offset of the buffer
  • size (int) – the total size of the buffer
  • buff (string) – a raw buffer containing the instructions
add_inote(msg, idx, off=None)

Add a message to a specific instruction, located by its index (the default) or by its address if off is specified

Parameters:
  • msg (string) – the message
  • idx (int) – index of the instruction (the position in the list of the instruction)
  • off (int) – address of the instruction
get_ins_off(off)

Get a particular instruction by using the address

Parameters:off (int) – address of the instruction
Return type:an Instruction object
get_insn()

Get the insn buffer

Return type:string
get_instruction(idx, off=None)

Get a particular instruction, located by its index (the default) or by its address if off is specified

Parameters:
  • idx (int) – index of the instruction (the position in the list of the instruction)
  • off (int) – address of the instruction
Return type:

an Instruction object

get_instructions()

Get the instructions

Return type:a generator of each Instruction (or a cached list of instructions if you have setup instructions)
get_length()

Return the length of this object

Return type:int
get_raw()

Return the raw buffer of this object

Return type:bytearray
is_cached_instructions()
off_to_pos(off)

Get the position of an instruction by using the address

Parameters:off (int) – address of the instruction
Return type:int
reload()
set_idx(idx)

Set the start address of the buffer

Parameters:idx (int) – the index
set_insn(insn)

Set a new raw buffer to disassemble

Parameters:insn (string) – the buffer
set_instructions(instructions)

Set the instructions

Parameters:instructions (a list of Instruction) – the list of instructions
show()

Display (with a pretty print) this object

class androguard.core.bytecodes.dvm.DalvikCode(buff, cm)

Bases: object

This class represents the instructions of a method

Parameters:
  • buff (string) – a raw buffer containing the instructions
  • cm (ClassManager object) – the ClassManager
add_inote(msg, idx, off=None)

Add a message to a specific instruction, selected by its index (default) or by its address if off is specified

Parameters:
  • msg (string) – the message
  • idx (int) – index of the instruction (the position in the list of the instruction)
  • off (int) – address of the instruction
get_bc()

Return the associated code object

Return type:DCode
get_debug()

Return the associated debug object

Return type:DebugInfoItem
get_debug_info_off()

Get the offset from the start of the file to the debug info (line numbers + local variable info) sequence for this code, or 0 if there simply is no information

Return type:int
get_handlers()

Get the bytes representing a list of lists of catch types and associated handler addresses.

Return type:EncodedCatchHandlerList
get_ins_size()

Get the number of words of incoming arguments to the method that this code is for

Return type:int
get_insns_size()

Get the size of the instructions list, in 16-bit code units

Return type:int
get_instruction(idx, off=None)
get_length()
get_obj()
get_off()
get_outs_size()

Get the number of words of outgoing argument space required by this code for method invocation

Return type:int
get_raw()

Get the reconstructed code as bytearray

Return type:bytearray
get_registers_size()

Get the number of registers used by this code

Return type:int
get_size()
get_tries()

Get the array indicating where in the code exceptions are caught and how to handle them

Return type:a list of TryItem objects
get_tries_size()

Get the number of TryItem for this instance

Return type:int
reload()
set_idx(idx)
set_off(off)
show()
class androguard.core.bytecodes.dvm.DalvikOdexVMFormat(buff, decompiler=None, config=None, using_api=None)

Bases: androguard.core.bytecodes.dvm.DalvikVMFormat

This class can parse an odex file

Parameters:
  • buff (string) – a string which represents the odex file
  • decompiler (object) – associate a decompiler object to display the java source code
Example:

DalvikOdexVMFormat( read("classes.odex") )

get_buff()

Return the whole buffer

Return type:bytearray
get_dependencies()

Return the odex dependencies object

Return type:an OdexDependencies object
get_format_type()

Return the type

Return type:a string
save()

Do not use !

class androguard.core.bytecodes.dvm.DalvikVMFormat(buff, decompiler=None, config=None, using_api=None)

Bases: androguard.core.bytecode.BuffHandle

This class can parse a classes.dex file of an Android application (APK).

Parameters:
  • buff (string) – a string which represents the classes.dex file
  • decompiler (object) – associate a decompiler object to display the java source code

example:

d = DalvikVMFormat( read("classes.dex") )
colorize_operands(operands, colors)
create_python_export()

Export classes/methods/fields’ names in the python namespace

disassemble(offset, size)

Disassembles a given offset in the DEX file

Parameters:
  • offset (int) – offset to disassemble in the file (from the beginning of the file)
  • size
fix_checksums(buff)

Fix a dex format buffer by setting all checksums

Return type:string
get_BRANCH_DVM_OPCODES()
get_all_fields()

Return a list of field items

Return type:a list of FieldIdItem objects
get_api_version()

This method returns api version that should be used for loading api specific resources.

Return type:int
get_class(name)

Return a specific class

Parameters:name – the name of the class
Return type:a ClassDefItem object
get_class_manager()

This function returns a ClassManager object which allows you to access all index references (strings, methods, fields, …)

Return type:ClassManager object
get_classes()

Return all classes

Return type:a list of ClassDefItem objects
get_classes_def_item()

This function returns the class def item

Return type:ClassHDefItem object
get_classes_names(update=False)

Return the names of classes

Parameters:update – if True, recompute the list. This may be needed after calling MyClass.set_name().
Return type:a list of string
get_cm_field(idx)

Get a specific field by using an index

Parameters:idx (int) – index of the field
get_cm_method(idx)

Get a specific method by using an index

Parameters:idx (int) – index of the method
get_cm_string(idx)

Get a specific string by using an index

Parameters:idx (int) – index of the string
get_cm_type(idx)

Get a specific type by using an index

Parameters:idx (int) – index of the type
get_codes_item()

This function returns the code item

Return type:CodeItem object
get_debug_info_item()

This function returns the debug info item

Return type:DebugInfoItem object
get_determineException()
get_determineNext()
get_field(name)

Return a list of all fields whose names match the regexp

Parameters:name – the name of the field (a python regexp)
Return type:a list with all EncodedField objects
get_field_descriptor(class_name, field_name, descriptor)

Return the specific field

Parameters:
  • class_name (string) – the class name of the field
  • field_name (string) – the name of the field
  • descriptor (string) – the descriptor of the field
Return type:

None or an EncodedField object

get_fields()

Return all field objects

Return type:a list of EncodedField objects
get_fields_class(class_name)

Return all fields of a specific class

Parameters:class_name (string) – the class name
Return type:a list with EncodedField objects
get_fields_id_item()

This function returns the field id item

Return type:FieldHIdItem object
get_format()
get_format_type()

Return the type

Return type:a string
get_header_item()

This function returns the header item

Return type:HeaderItem object
get_len_methods()

Return the number of methods

Return type:int
get_method(name)

Return a list of all methods whose names match the regexp

Parameters:name – the name of the method (a python regexp)
Return type:a list with all EncodedMethod objects
get_method_by_idx(idx)

Return a specific method by using an index

Parameters:idx (int) – the index of the method
Return type:None or an EncodedMethod object
get_method_descriptor(class_name, method_name, descriptor)

Return the specific method

Parameters:
  • class_name (string) – the class name of the method
  • method_name (string) – the name of the method
  • descriptor (string) – the descriptor of the method
Return type:

None or an EncodedMethod object

get_methods()

Return all method objects

Return type:a list of EncodedMethod objects
get_methods_class(class_name)

Return all methods of a specific class

Parameters:class_name (string) – the class name
Return type:a list with EncodedMethod objects
get_methods_descriptor(class_name, method_name)

Return the specific methods of the class

Parameters:
  • class_name (string) – the class name of the method
  • method_name (string) – the name of the method
Return type:

None or an EncodedMethod object

get_methods_id_item()

This function returns the method id item

Return type:MethodHIdItem object
get_operand_html(operand, registers_colors, colors, escape_fct, wrap_fct)
get_regex_strings(regular_expressions)

Return all strings that match the regex

Parameters:regular_expressions (string) – the python regex
Return type:a list of strings matching the regex expression
get_string_data_item()

This function returns the string data item

Return type:StringDataItem object
get_strings()

Return all strings

The strings will have escaped surrogates if only a single high or low surrogate is found. Complete surrogate pairs are combined into the 32-bit character they represent.

Return type:a list with all strings used in the format (types, names …)
get_strings_unicode()

Return all strings

This method will return pure UTF-16 strings. This is the “exact” same string as used in Java. Those strings can be problematic for Python, as they can contain surrogates as well as “broken” surrogate pairs, i.e. single high or low surrogates. Such a string cannot, for example, be printed. To avoid such problems, there is an escape mechanism to detect such lone surrogates and escape them in the string. Of course, this results in a different string than in the Java source!

Use get_strings() for general purposes and get_strings_unicode() if you require the exact string from the Java source. You can always escape the string from get_strings_unicode() using the function androguard.core.bytecodes.mutf8.patch_string().

Return type:a list with all strings used in the format (types, names …)
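
The difference between the two string flavours can be illustrated with a small stand-alone sketch. It is not the code of androguard.core.bytecodes.mutf8.patch_string(); the function name and the exact escape format (a literal \uXXXX sequence) are assumptions made for illustration only.

```python
def escape_lone_surrogates(s):
    """Combine complete surrogate pairs and escape lone surrogates (sketch)."""
    out = []
    i = 0
    while i < len(s):
        c = ord(s[i])
        is_high = 0xD800 <= c <= 0xDBFF
        has_low = i + 1 < len(s) and 0xDC00 <= ord(s[i + 1]) <= 0xDFFF
        if is_high and has_low:
            # Complete pair: merge both halves into one code point.
            low = ord(s[i + 1])
            out.append(chr(0x10000 + ((c - 0xD800) << 10) + (low - 0xDC00)))
            i += 2
        elif 0xD800 <= c <= 0xDFFF:
            # Lone high or low surrogate: replace it by a printable escape.
            out.append("\\u{:04x}".format(c))
            i += 1
        else:
            out.append(s[i])
            i += 1
    return "".join(out)
```

A string such as "x\ud83dy" contains a lone high surrogate and cannot be encoded to UTF-8 as is; after escaping it becomes plain printable text, at the cost of no longer being identical to the Java string.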
get_vmanalysis()

Deprecated since version 3.1.0: The Analysis is no longer loaded into DalvikVMFormat, in order to avoid cyclic dependencies. Analysis now extends DalvikVMFormat. This method no longer does anything!

The Analysis object should contain all the information required, including the DalvikVMFormats.

list_classes_hierarchy()
print_classes_hierarchy()
save()

Return the dex (with the modifications) in raw format, with fixed checksums (beta: do not use!)

Return type:string
set_decompiler(decompiler)
set_vmanalysis(analysis)

Deprecated since version 3.1.0: The Analysis is no longer loaded into DalvikVMFormat, in order to avoid cyclic dependencies. Analysis now extends DalvikVMFormat. This method no longer does anything!

The Analysis object should contain all the information required, including the DalvikVMFormats.

show()

Show all the information in the object

class androguard.core.bytecodes.dvm.DebugInfoItem(buff, cm)

Bases: object

get_bytecodes()
get_line_start()
get_off()
get_parameter_names()
get_parameters_size()
get_raw()
get_translated_parameter_names()
reload()
show()
class androguard.core.bytecodes.dvm.DebugInfoItemEmpty(buff, cm)

Bases: object

get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.EncodedAnnotation(buff, cm)

Bases: object

This class can parse an encoded_annotation of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_annotation
  • cm (ClassManager) – a ClassManager object
get_elements()

Return the elements of the annotation, represented directly in-line (not as offsets)

Return type:a list of AnnotationElement objects
get_length()
get_obj()
get_raw()
get_size()

Return the number of name-value mappings in this annotation

Return type:int

get_type_idx()

Return the type of the annotation. This must be a class (not array or primitive) type

Return type:int
show()
class androguard.core.bytecodes.dvm.EncodedArray(buff, cm)

Bases: object

This class can parse an encoded_array of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_array
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_raw()
get_size()

Return the number of elements in the array

Return type:int
get_values()

Return a series of size encoded_value byte sequences in the format specified by this section, concatenated sequentially

Return type:a list of EncodedValue objects
show()
class androguard.core.bytecodes.dvm.EncodedArrayItem(buff, cm)

Bases: object

This class can parse an encoded_array_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_array_item
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_off()
get_raw()
get_value()

Return the bytes representing the encoded array value

Return type:an EncodedArray object
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.EncodedCatchHandler(buff, cm)

Bases: object

This class can parse an encoded_catch_handler of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_catch_handler
  • cm (ClassManager) – a ClassManager object
get_catch_all_addr()

Return the bytecode address of the catch-all handler. This element is only present if size is non-positive.

Return type:int
get_handlers()

Return the stream of abs(size) encoded items, one for each caught type, in the order that the types should be tested.

Return type:a list of EncodedTypeAddrPair objects
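
How the signed size field relates to the handlers and to the catch-all address can be sketched as follows. The helper name is hypothetical; it only restates the rule given above (abs(size) typed handlers, catch-all present when size is non-positive) and does not parse any DEX data.

```python
def interpret_handler_size(size):
    """Return (number_of_typed_handlers, has_catch_all) for a handler size."""
    # A non-positive size means a catch-all handler follows the typed handlers.
    return abs(size), size <= 0
```

For instance, a size of -1 describes one typed handler followed by a catch-all handler, while a size of 2 describes two typed handlers and no catch-all.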
get_length()
get_off()
get_raw()
Return type:bytearray
get_size()

Return the number of catch types in this list

Return type:int
set_off(off)
show()
class androguard.core.bytecodes.dvm.EncodedCatchHandlerList(buff, cm)

Bases: object

This class can parse an encoded_catch_handler_list of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_catch_handler_list
  • cm (ClassManager) – a ClassManager object
get_length()
get_list()

Return the actual list of handler lists, represented directly (not as offsets), and concatenated sequentially

Return type:a list of EncodedCatchHandler objects
get_obj()
get_off()
get_raw()
Return type:bytearray
get_size()

Return the size of this list, in entries

Return type:int
set_off(off)
show()
class androguard.core.bytecodes.dvm.EncodedField(buff, cm)

Bases: object

This class can parse an encoded_field of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded field
  • cm (ClassManager) – a ClassManager object
adjust_idx(val)
get_access_flags()

Return the access flags of the field

Return type:int
get_access_flags_string()

Return the access flags string of the field

Return type:string
get_class_name()

Return the class name of the field

Return type:string
get_descriptor()

Return the descriptor of the field

The descriptor of a field is the type of the field.

Return type:string
get_field_idx()

Return the real index of the field

Return type:int
get_field_idx_diff()

Return the index into the field_ids list for the identity of this field (includes the name and descriptor), represented as a difference from the index of previous element in the list

Return type:int
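
Because each entry stores only the difference to the previous index (the first entry stores its index directly), the real indices are the running sum of the differences. A minimal sketch, with a hypothetical helper name:

```python
from itertools import accumulate


def absolute_indices(idx_diffs):
    """Turn a list of index differences into absolute indices."""
    # The first difference is taken relative to 0, i.e. it is the index itself.
    return list(accumulate(idx_diffs))
```

With differences [3, 1, 4] the absolute indices are 3, 3 + 1 and 3 + 1 + 4.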
get_init_value()

Return the init value object of the field

Return type:EncodedValue
get_name()

Return the name of the field

Return type:string
get_obj()
get_raw()
get_size()
load()
reload()
set_init_value(value)

Setup the init value object of the field

Parameters:value (EncodedValue) – the init value
set_name(value)
show()

Display the information (with a pretty print) about the field

class androguard.core.bytecodes.dvm.EncodedMethod(buff, cm)

Bases: object

This class can parse an encoded_method of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_method
  • cm (ClassManager) – a ClassManager object
access_flags = None

access flags of the method

add_inote(msg, idx, off=None)

Add a message to a specific instruction, selected by its index (default) or by its address if off is specified

Parameters:
  • msg (string) – the message
  • idx (int) – index of the instruction (the position in the list of the instruction)
  • off (int) – address of the instruction
add_note(msg)

Add a message to this method

Parameters:msg (string) – the message
adjust_idx(val)
code_off = None

offset of the code section

each_params_by_register(nb, proto)

From the Dalvik Bytecode documentation:

> The N arguments to a method land in the last N registers
> of the method’s invocation frame, in order.
> Wide arguments consume two registers.
> Instance methods are passed a this reference as their first argument.

This method will print a description of the register usage to stdout.

Parameters:
  • nb – number of registers
  • proto – descriptor of method
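
The register rule quoted above can be worked through with a small stand-alone sketch. It does not reproduce the output of each_params_by_register(); the function name, the is_static flag and the list-of-type-strings input are assumptions chosen for illustration.

```python
WIDE_TYPES = {"J", "D"}  # long and double occupy two registers


def param_registers(nb, param_types, is_static=False):
    """Map each argument to its register(s), given nb registers in total."""
    # Instance methods receive `this` as an implicit first argument.
    types = ([] if is_static else ["this"]) + list(param_types)
    widths = [2 if t in WIDE_TYPES else 1 for t in types]
    reg = nb - sum(widths)  # arguments land in the last registers
    if reg < 0:
        raise ValueError("not enough registers for the arguments")
    layout = []
    for t, w in zip(types, widths):
        layout.append((t, ["v{}".format(r) for r in range(reg, reg + w)]))
        reg += w
    return layout
```

For an instance method with 5 registers and the parameters I and J, the arguments need four registers, so this lands in v1, the int in v2 and the long in v3 and v4; v0 remains a local register.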
get_access_flags()

Return the access flags of the method

Return type:int
get_access_flags_string()

Return the access flags string of the method

A description of all access flags can be found here: https://source.android.com/devices/tech/dalvik/dex-format#access-flags

Return type:string
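
As an illustration of how an access flags integer maps to names, the sketch below decodes a subset of the method flags listed in the DEX format documentation. The helper name and the space-separated output format are assumptions; the string actually returned by get_access_flags_string() may differ.

```python
# Subset of the DEX access flags that apply to methods.
METHOD_ACCESS_FLAGS = [
    (0x1, "public"),
    (0x2, "private"),
    (0x4, "protected"),
    (0x8, "static"),
    (0x10, "final"),
    (0x20, "synchronized"),
    (0x100, "native"),
    (0x400, "abstract"),
    (0x10000, "constructor"),
]


def access_flags_to_string(flags):
    """Join the names of all flags whose bit is set in `flags`."""
    return " ".join(name for bit, name in METHOD_ACCESS_FLAGS if flags & bit)
```

A value of 0x9 has the bits 0x1 and 0x8 set and therefore describes a public static method.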
get_address()

Return the offset from the start of the file to the code structure for this method, or 0 if this method is either abstract or native

Return type:int
get_class_name()

Return the class name of the method

Return type:string
get_code()

Return the code object associated to the method

Return type:DalvikCode object or None if no Code
get_code_off()

Return the offset from the start of the file to the code structure for this method, or 0 if this method is either abstract or native

Return type:int
get_debug()

Return the debug object associated to this method

Return type:DebugInfoItem
get_descriptor()

Return the descriptor of the method.

A method descriptor has the form (A A A …)R, where A are the arguments to the method and R is the return type. Basic types have the short form, e.g. I for integer and V for void; class types are named like a class name, e.g. Ljava/lang/String;.

Typical descriptors look like this:

  (I)I   // one integer argument, integer return
  (C)Z   // one char argument, boolean as return
  (Ljava/lang/CharSequence; I)I   // CharSequence and integer as arguments, integer as return
  (C)Ljava/lang/String;  // char as argument, String as return

More information about type descriptors can be found here: https://source.android.com/devices/tech/dalvik/dex-format#typedescriptor

Return type:string
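
A descriptor in the space-separated form shown above can be split into its argument types and return type with a few lines of Python. The helper name is hypothetical, and the sketch assumes that arguments are separated by single spaces exactly as in the examples.

```python
def split_descriptor(descriptor):
    """Split '(A A A)R' into (list of argument types, return type)."""
    if not descriptor.startswith("(") or ")" not in descriptor:
        raise ValueError("not a method descriptor: {!r}".format(descriptor))
    # Everything between the parentheses is the argument list.
    args, _, ret = descriptor[1:].rpartition(")")
    return args.split(), ret
```

Applied to (Ljava/lang/CharSequence; I)I this yields the two argument types Ljava/lang/CharSequence; and I, and the return type I.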
get_information()
get_instruction(idx, off=None)

Get a particular instruction, selected by its index (default) or by its address if off is specified

Parameters:
  • idx (int) – index of the instruction (the position in the list of the instruction)
  • off (int) – address of the instruction
Return type:

an Instruction object

get_instructions()

Get the instructions

Return type:a generator of each Instruction (or a cached list of instructions if they were set with set_instructions())
get_length()

Return the length of the associated code of the method

Return type:int
get_locals()
get_method_idx()

Return the real index of the method

Return type:int
get_method_idx_diff()

Return the index into the method_ids list for the identity of this method (includes the name and descriptor), represented as a difference from the index of the previous element in the list

Return type:int
get_name()

Return the name of the method

Return type:string
get_raw()
get_short_string()

Return a shorter formatted String which encodes this method. The returned name has the form: <classname> <methodname> ([arguments …])<returntype>

  • All Class names are condensed to the actual name (no package).
  • Access flags are not returned.
  • <init> and <clinit> are NOT replaced by the classname!

This name might not be unique!

Returns:str
get_size()
get_source()
get_triple()
is_cached_instructions()
load()
method_idx_diff = None

method index diff in the corresponding section

reload()
set_code_idx(idx)

Set the start address of the buffer to disassemble

Parameters:idx (int) – the index
set_instructions(instructions)

Set the instructions

Parameters:instructions (a list of Instruction) – the list of instructions
set_name(value)
show()

Display the information (with a pretty print) about the method

show_info()

Display the basic information about the method

show_notes()

Display the notes about the method

source()

Return the source code of this method

Return type:string
class androguard.core.bytecodes.dvm.EncodedTypeAddrPair(buff)

Bases: object

This class can parse an encoded_type_addr_pair of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_type_addr_pair
  • cm (ClassManager) – a ClassManager object
get_addr()

Return the bytecode address of the associated exception handler

Return type:int
get_length()
get_obj()
get_raw()
get_type_idx()

Return the index into the type_ids list for the type of the exception to catch

Return type:int
show()
class androguard.core.bytecodes.dvm.EncodedValue(buff, cm)

Bases: object

This class can parse an encoded_value of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the encoded_value
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_raw()
get_value()

Return the bytes representing the value, variable in length and interpreted differently for different value_type bytes, though always little-endian

Return type:an object representing the value
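
What "variable in length, always little-endian" means can be shown for the signed integer case. This sketch is not Androguard's decoder: the function name is hypothetical, and other value types (floats, strings, arrays and so on) are interpreted differently and are not covered here.

```python
def decode_signed_le(raw):
    """Decode a variable-length, little-endian, sign-extended integer."""
    # The most significant bit of the last byte carries the sign.
    return int.from_bytes(raw, byteorder="little", signed=True)
```

The two bytes 34 12 decode to 0x1234, and the single byte ff decodes to -1.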
get_value_arg()
get_value_type()
show()
exception androguard.core.bytecodes.dvm.Error

Bases: Exception

Base class for exceptions in this module.

class androguard.core.bytecodes.dvm.ExportObject

Bases: object

Wrapper object for ipython exports

class androguard.core.bytecodes.dvm.FakeNop(length)

Bases: androguard.core.bytecodes.dvm.Instruction10x

Simulate a nop instruction.

get_length()

Return the length of the instruction

Return type:int
class androguard.core.bytecodes.dvm.FieldAnnotation(buff, cm)

Bases: object

This class can parse a field_annotation of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the field_annotation
  • cm (ClassManager) – a ClassManager object
get_annotations_off()

Return the offset from the start of the file to the list of annotations for the field

Return type:int
get_field_idx()

Return the index into the field_ids list for the identity of the field being annotated

Return type:int
get_length()
get_obj()
get_off()
get_raw()
set_off(off)
show()
class androguard.core.bytecodes.dvm.FieldHIdItem(size, buff, cm)

Bases: object

This class can parse a list of field_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the list of field_id_item
  • cm (ClassManager) – a ClassManager object
get(idx)
get_length()
get_obj()
get_off()
get_raw()
gets()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.FieldIdItem(buff, cm)

Bases: object

This class can parse a field_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the field_id_item
  • cm (ClassManager) – a ClassManager object
get_class_idx()

Return the index into the type_ids list for the definer of this field

Return type:int
get_class_name()

Return the class name of the field

Return type:string
get_descriptor()

Return the descriptor of the field

Return type:string
get_length()
get_list()
get_name()

Return the name of the field

Return type:string
get_name_idx()

Return the index into the string_ids list for the name of this field

Return type:int
get_obj()
get_raw()
get_type()

Return the type of the field

Return type:string
get_type_idx()

Return the index into the type_ids list for the type of this field

Return type:int
reload()
show()
class androguard.core.bytecodes.dvm.FieldIdItemInvalid

Bases: object

get_class_name()
get_descriptor()
get_list()
get_name()
get_type()
show()
class androguard.core.bytecodes.dvm.FillArrayData(buff)

Bases: object

This class can parse a FillArrayData instruction

Parameters:buff – a Buff object which represents a buffer where the instruction is stored
add_note(msg)

Add a note to this instruction

Parameters:msg (objects (string)) – the message
get_data()

Return the data of this instruction (the payload)

Return type:string
get_formatted_operands()
get_hex()

Return a hex string in which the bytes are separated by spaces

get_length()

Return the length of the instruction

Return type:int
get_name()

Return the name of the instruction

Return type:string
get_notes()

Get all notes from this instruction

Return type:a list of objects
get_op_value()

Get the value of the opcode

Return type:int
get_operands(idx=-1)
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()
show(pos)

Print the instruction

show_buff(pos)

Return the display of the instruction

Return type:string
class androguard.core.bytecodes.dvm.HeaderItem(size, buff, cm)

Bases: object

This class can parse a header_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the header_item
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.Instruction

Bases: object

This class represents a dalvik instruction

get_formatted_operands()
get_hex()

Return a hex string in which the bytes are separated by spaces
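
The output format can be pictured with the following stand-alone sketch, which formats a raw buffer such as the one returned by get_raw(). The helper name and the use of lowercase hex digits are assumptions, not a statement about Androguard's exact output.

```python
def hex_with_spaces(raw):
    """Format bytes as two-digit hex values separated by single spaces."""
    return " ".join("{:02x}".format(b) for b in bytes(raw))
```

For the four bytes 0x12 0x01 0x0e 0x00 this produces the text 12 01 0e 00.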

get_kind()

Return the ‘kind’ argument of the instruction

Return type:int
get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_name()

Return the name of the instruction

Return type:string
get_op_value()

Return the value of the opcode

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
get_translated_kind()

Return the translated value of the ‘kind’ argument

Return type:string
show(idx)

Print the instruction

show_buff(idx)

Return the display of the instruction

Return type:string
class androguard.core.bytecodes.dvm.Instruction10t(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 10t format

get_length()

Return the length of the instruction

Return type:int
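
In the Dalvik instruction format names (10t, 21c, 30t, …) the first digit gives the number of 16-bit code units the instruction occupies. Assuming that get_length() reports the length in bytes, the expected value for a format can be derived as in this sketch; the helper name is hypothetical.

```python
def format_length_bytes(format_id):
    """Return the instruction length in bytes for a format ID such as '21c'."""
    code_units = int(format_id[0])  # first digit: number of 16-bit code units
    return code_units * 2
```

Under that assumption, a 10t instruction is 2 bytes long, a 21c instruction 4 bytes and a 30t instruction 6 bytes.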
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_off()
class androguard.core.bytecodes.dvm.Instruction10x(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 10x format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction11n(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 11n format

get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction11x(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 11x format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction12x(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 12x format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction20bc(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 20bc format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction20t(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 20t format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_off()
class androguard.core.bytecodes.dvm.Instruction21c(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 21c format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_raw_string()
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
get_string()
class androguard.core.bytecodes.dvm.Instruction21h(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 21h format

get_formatted_operands()
get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction21s(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 21s format

get_formatted_operands()
get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction21t(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 21t format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_off()
class androguard.core.bytecodes.dvm.Instruction22b(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 22b format

get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction22c(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 22c format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction22cs(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 22cs format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction22s(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 22s format

get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction22t(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 22t format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_off()
class androguard.core.bytecodes.dvm.Instruction22x(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 22x format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction23x(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 23x format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction30t(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 30t format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_off()
class androguard.core.bytecodes.dvm.Instruction31c(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 31c format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_raw_string()
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
get_string()

Return the string associated to the ‘kind’ argument

Return type:string
class androguard.core.bytecodes.dvm.Instruction31i(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 31i format

get_formatted_operands()
get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction31t(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 31t format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_off()
class androguard.core.bytecodes.dvm.Instruction32x(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 32x format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction35c(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 35c format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
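
The 35c format packs a register count, up to five 4-bit register numbers and a 16-bit index into three code units. The sketch below assumes the layout from the Dalvik bytecode specification (`A|G|op BBBB F|E|D|C`). `decode_35c` is a hypothetical helper for illustration only, not Androguard's implementation.

```python
import struct


def decode_35c(raw):
    """Decode one 35c instruction from its 6 raw bytes."""
    op, ag, index, dc, fe = struct.unpack("<BBHBB", raw[:6])
    count = ag >> 4  # A: number of registers actually used (0..5)
    # Register order is C, D, E, F, G.
    registers = [dc & 0xF, dc >> 4, fe & 0xF, fe >> 4, ag & 0xF]
    return op, index, registers[:count]


# invoke-virtual {v1, v2}, method@0005 (opcode 0x6e)
decoded = decode_35c(bytes([0x6E, 0x20, 0x05, 0x00, 0x21, 0x00]))
```

The 16-bit index is the value that is interpreted according to the ‘kind’ of the instruction (for example a method or type index).
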
class androguard.core.bytecodes.dvm.Instruction35mi(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 35mi format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction35ms(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 35ms format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction3rc(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 3rc format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction3rmi(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 3rmi format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction3rms(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 3rms format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction40sc(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 40sc format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction41c(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 41c format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction51l(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 51l format

get_formatted_operands()
get_length()

Return the length of the instruction

Return type:int
get_literals()

Return the associated literals

Return type:list of int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
class androguard.core.bytecodes.dvm.Instruction52c(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 52c format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.Instruction5rc(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents all instructions which have the 5rc format

get_length()

Return the length of the instruction

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
get_ref_kind()

Return the value of the ‘kind’ argument

Return type:value
class androguard.core.bytecodes.dvm.InstructionInvalid(cm, buff)

Bases: androguard.core.bytecodes.dvm.Instruction

This class represents an invalid instruction

get_length()

Return the length of the instruction

Return type:int
get_name()

Return the name of the instruction

Return type:string
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
exception androguard.core.bytecodes.dvm.InvalidInstruction

Bases: androguard.core.bytecodes.dvm.Error

class androguard.core.bytecodes.dvm.LinearSweepAlgorithm

Bases: object

This class is used to disassemble a method. The algorithm used by this class is linear sweep.

get_instructions(cm, size, insn, idx)
Parameters:
  • cm (ClassManager object) – a ClassManager object
  • size (int) – the total size of the buffer
  • insn (string) – a raw buffer containing the instructions
  • idx (int) – a start address in the buffer
Return type:a generator of Instruction objects
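
The idea behind linear sweep can be sketched as follows: decode the instruction at the current offset, advance by its length, and repeat until the buffer is exhausted. The opcode table below is a heavily reduced, hypothetical stand-in for the real Dalvik opcode set, and treating unknown opcodes as one code unit is an assumption made for this example only.

```python
# Hypothetical, reduced table: opcode -> (mnemonic, size in 16-bit code units)
TOY_OPCODES = {
    0x00: ("nop", 1),
    0x0E: ("return-void", 1),
    0x12: ("const/4", 1),
    0x13: ("const/16", 2),
    0x14: ("const", 3),
}


def linear_sweep(insn, idx=0):
    """Yield (byte offset, mnemonic) for each instruction, one after another."""
    while idx < len(insn):
        opcode = insn[idx]  # the opcode is the low byte of the first code unit
        name, units = TOY_OPCODES.get(opcode, ("unknown", 1))
        yield idx, name
        idx += units * 2  # a code unit is two bytes


# const/4 v0, #1 ; const/16 v0, #0x1234 ; return-void
code = bytes([0x12, 0x10, 0x13, 0x00, 0x34, 0x12, 0x0E, 0x00])
listing = list(linear_sweep(code))
```

Because the sweep never follows branches, data embedded between instructions would be decoded as if it were code.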

class androguard.core.bytecodes.dvm.MapItem(buff, cm)

Bases: object

get_item()
get_length()
get_obj()
get_off()

Gets the offset of the map item itself inside the DEX file

get_offset()

Gets the offset of the item of the map item

get_raw()
get_size()
get_type()
parse()
reload()
set_item(item)
show()
class androguard.core.bytecodes.dvm.MapList(cm, off, buff)

Bases: object

This class can parse the “map_list” of the dex format

https://source.android.com/devices/tech/dalvik/dex-format#map-list

get_class_manager()
get_item_type(ttype)

Get a particular item type

Parameters:ttype – a string which represents the desired type
Return type:None or the item object
get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()

Pretty-print the MapList object
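
For orientation, a map_list can be read by hand as sketched below. The sketch assumes the layout from the DEX format specification linked above (a uint entry count followed by 12-byte map_item records of ushort type, ushort unused, uint size, uint offset). `parse_map_list` is a hypothetical helper and not part of Androguard.

```python
import struct


def parse_map_list(buff, off=0):
    """Return a list of (type, size, offset) tuples for the map_list at off."""
    (count,) = struct.unpack_from("<I", buff, off)
    items = []
    for i in range(count):
        item_type, _unused, size, offset = struct.unpack_from(
            "<HHII", buff, off + 4 + i * 12
        )
        items.append((item_type, size, offset))
    return items


# Two entries: TYPE_HEADER_ITEM (0x0000) and TYPE_MAP_LIST (0x1000)
raw = (
    struct.pack("<I", 2)
    + struct.pack("<HHII", 0x0000, 0, 1, 0x00)
    + struct.pack("<HHII", 0x1000, 0, 1, 0x70)
)
entries = parse_map_list(raw)
```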

class androguard.core.bytecodes.dvm.MethodAnnotation(buff, cm)

Bases: object

This class can parse a method_annotation of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the method_annotation
  • cm (ClassManager) – a ClassManager object
get_annotations_off()

Return the offset from the start of the file to the list of annotations for the method

Return type:int
get_length()
get_method_idx()

Return the index into the method_ids list for the identity of the method being annotated

Return type:int
get_obj()
get_off()
get_raw()
set_off(off)
show()
class androguard.core.bytecodes.dvm.MethodHIdItem(size, buff, cm)

Bases: object

This class can parse a list of method_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the list of method_id_item
  • cm (ClassManager) – a ClassManager object
get(idx)
get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.MethodIdItem(buff, cm)

Bases: object

This class can parse a method_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the method_id_item
  • cm (ClassManager) – a ClassManager object
get_class_idx()

Return the index into the type_ids list for the definer of this method

Return type:int
get_class_name()

Return the class name of the method

Return type:string
get_descriptor()

Return the descriptor

Return type:string
get_length()
get_list()
get_name()

Return the name of the method

Return type:string
get_name_idx()

Return the index into the string_ids list for the name of this method

Return type:int
get_obj()
get_proto()

Return the prototype of the method

Return type:string
get_proto_idx()

Return the index into the proto_ids list for the prototype of this method

Return type:int
get_raw()
get_real_descriptor()

Return the real descriptor (i.e. without extra spaces)

Return type:string
get_triple()
reload()
show()
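
The three index accessors above mirror the fields of a method_id_item. As a minimal sketch, assuming the layout from the DEX format specification (ushort class_idx, ushort proto_idx, uint name_idx), the record can be unpacked as follows. `MethodId` and `parse_method_id` are hypothetical names used only in this example.

```python
import struct
from collections import namedtuple

MethodId = namedtuple("MethodId", "class_idx proto_idx name_idx")


def parse_method_id(buff, off=0):
    """Unpack one 8-byte method_id_item starting at off."""
    return MethodId(*struct.unpack_from("<HHI", buff, off))


method = parse_method_id(struct.pack("<HHI", 3, 7, 42))
```

Each index then has to be resolved against the type_ids, proto_ids and string_ids lists respectively to obtain the class name, prototype and method name.
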
class androguard.core.bytecodes.dvm.MethodIdItemInvalid

Bases: object

get_class_name()
get_descriptor()
get_list()
get_name()
get_proto()
show()
class androguard.core.bytecodes.dvm.OdexDependencies(buff)

Bases: object

This class can parse the odex dependencies

Parameters:buff – a Buff object string which represents the odex dependencies
get_dependencies()

Return the list of dependencies

Return type:a list of strings
get_raw()
class androguard.core.bytecodes.dvm.OdexHeaderItem(buff)

Bases: object

This class can parse the odex header

Parameters:buff – a Buff object string which represents the odex header
get_raw()
show()
class androguard.core.bytecodes.dvm.OffObj(o)

Bases: object

class androguard.core.bytecodes.dvm.PackedSwitch(buff)

Bases: object

This class can parse a PackedSwitch instruction

Parameters:buff – a Buff object which represents a buffer where the instruction is stored
add_note(msg)

Add a note to this instruction

Parameters:msg (objects (string)) – the message
get_formatted_operands()
get_hex()

Returns a hex string in which each byte is separated by a space

get_keys()

Return the keys of the instruction

Return type:a list of long
get_length()
get_name()

Return the name of the instruction

Return type:string
get_notes()

Get all notes from this instruction

Return type:a list of objects
get_op_value()

Get the value of the opcode

Return type:int
get_operands(idx=-1)

Return an additional output of the instruction

Return type:string
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()
get_targets()

Return the targets (address) of the instruction

Return type:a list of long
get_values()
show(pos)

Print the instruction

show_buff(pos)

Return the display of the instruction

Return type:string
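
How keys and targets relate in a packed switch can be sketched as below. The sketch assumes the packed-switch-payload layout from the Dalvik bytecode specification (ushort ident 0x0100, ushort size, int first_key, then size int targets). `parse_packed_switch` is a hypothetical helper, not Androguard's parser.

```python
import struct


def parse_packed_switch(raw):
    """Return (keys, targets) of a packed-switch payload."""
    ident, size, first_key = struct.unpack_from("<HHi", raw, 0)
    if ident != 0x0100:
        raise ValueError("not a packed-switch payload")
    targets = list(struct.unpack_from("<%di" % size, raw, 8))
    # Keys are not stored individually: they count up from first_key.
    keys = [first_key + i for i in range(size)]
    return keys, targets


payload = struct.pack("<HHi3i", 0x0100, 3, 10, 5, 8, 11)
keys, targets = parse_packed_switch(payload)
```
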
class androguard.core.bytecodes.dvm.ParameterAnnotation(buff, cm)

Bases: object

This class can parse a parameter_annotation of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the parameter_annotation
  • cm (ClassManager) – a ClassManager object
get_annotations_off()

Return the offset from the start of the file to the list of annotations for the method parameters

Return type:int
get_length()
get_method_idx()

Return the index into the method_ids list for the identity of the method whose parameters are being annotated

Return type:int
get_obj()
get_off()
get_raw()
set_off(off)
show()
class androguard.core.bytecodes.dvm.ProtoHIdItem(size, buff, cm)

Bases: object

This class can parse a list of proto_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the list of proto_id_item
  • cm (ClassManager) – a ClassManager object
get(idx)
get_length()
get_obj()
get_off()
get_raw()
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.ProtoIdItem(buff, cm)

Bases: object

This class can parse a proto_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the proto_id_item
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_parameters_off()

Return the offset from the start of the file to the list of parameter types for this prototype, or 0 if this prototype has no parameters

Return type:int
get_parameters_off_value()

Return the string associated to the parameters_off

Return type:string
get_raw()
get_return_type_idx()

Return the index into the type_ids list for the return type of this prototype

Return type:int
get_return_type_idx_value()

Return the string associated to the return_type_idx

Return type:string
get_shorty_idx()

Return the index into the string_ids list for the short-form descriptor string of this prototype

Return type:int
get_shorty_idx_value()

Return the string associated to the shorty_idx

Return type:string
reload()
show()
class androguard.core.bytecodes.dvm.ProtoIdItemInvalid

Bases: object

get_params()
get_return_type()
get_shorty()
show()
class androguard.core.bytecodes.dvm.SparseSwitch(buff)

Bases: object

This class can parse a SparseSwitch instruction

Parameters:buff – a Buff object which represents a buffer where the instruction is stored
add_note(msg)

Add a note to this instruction

Parameters:msg (objects (string)) – the message
get_formatted_operands()
get_hex()

Returns a hex string in which each byte is separated by a space

get_keys()

Return the keys of the instruction

Return type:a list of long
get_length()
get_name()

Return the name of the instruction

Return type:string
get_notes()

Get all notes from this instruction

Return type:a list of objects
get_op_value()

Get the value of the opcode

Return type:int
get_operands(idx=-1)

Return an additional output of the instruction

Return type:string
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()
get_targets()

Return the targets (address) of the instruction

Return type:a list of long
get_values()
show(pos)

Print the instruction

show_buff(pos)

Return the display of the instruction

Return type:string
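
In contrast to a packed switch, a sparse switch stores every key explicitly. The sketch below assumes the sparse-switch-payload layout from the Dalvik bytecode specification (ushort ident 0x0200, ushort size, size int keys, then size int targets). `parse_sparse_switch` is a hypothetical helper for illustration only.

```python
import struct


def parse_sparse_switch(raw):
    """Return (keys, targets) of a sparse-switch payload."""
    ident, size = struct.unpack_from("<HH", raw, 0)
    if ident != 0x0200:
        raise ValueError("not a sparse-switch payload")
    keys = list(struct.unpack_from("<%di" % size, raw, 4))
    targets = list(struct.unpack_from("<%di" % size, raw, 4 + 4 * size))
    return keys, targets


payload = struct.pack("<HH2i2i", 0x0200, 2, -5, 100, 6, 9)
keys, targets = parse_sparse_switch(payload)
```
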
class androguard.core.bytecodes.dvm.StringDataItem(buff, cm)

Bases: object

This class can parse a string_data_item of a dex file

Strings in Dalvik files might not be representable in Python! You can store any UTF-16 character inside a Dalvik file, but the resulting string might not be decodable in Python, as it can contain invalid surrogate pairs.

To circumvent this issue, this class offers several ways to access the string. There are also some fallbacks implemented to make an “invalid” string printable in Python. Dalvik uses MUTF-8 as the encoding for strings. This encoding has the advantage of allowing null-terminated strings in UTF-8 encoding, as the null character maps to something else. Therefore you can use get_data() to retrieve the actual data of the string and handle the encoding yourself. Alternatively, use get_unicode() to return a decoded UTF-16 string, which might cause problems during printing or saving. If you want a representation of the string that should be printable in Python, you can use get(), which escapes invalid characters.

Parameters:
  • buff (BuffHandle) – a string which represents a Buff object of the string_data_item
  • cm (ClassManager) – a ClassManager object
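
The two MUTF-8 properties described above (the null character is not stored as a zero byte, and characters outside the Basic Multilingual Plane are stored as two separately encoded surrogates) can be illustrated with a small encoder. This is a minimal sketch based on the MUTF-8 description in the DEX format specification. `encode_mutf8` is a hypothetical helper and not the routine Androguard uses.

```python
def encode_mutf8(text):
    """Encode a Python string as MUTF-8 (without length prefix or terminator)."""
    out = bytearray()
    # MUTF-8 operates on UTF-16 code units; surrogatepass keeps lone surrogates.
    utf16 = text.encode("utf-16-be", "surrogatepass")
    for i in range(0, len(utf16), 2):
        unit = (utf16[i] << 8) | utf16[i + 1]
        if unit != 0 and unit < 0x80:
            out.append(unit)  # plain ASCII, one byte
        elif unit < 0x800:
            # Two-byte form; this branch also turns U+0000 into C0 80.
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:
            # Three-byte form, also used for each half of a surrogate pair.
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return bytes(out)


with_null = encode_mutf8("a\x00b")
astral = encode_mutf8("\U00024f5c")
lonely = encode_mutf8("\ud853")
```

Since the encoded data never contains a zero byte, a single trailing zero byte can safely terminate the string.
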
get()

Returns a printable string. In this case, all lonely surrogates are escaped and thus represented in the string as 6 characters, e.g. \ud853. Valid surrogates are encoded as 32-bit values, i.e. 𤽜.

get_data()

Return a series of MUTF-8 code units (a.k.a. octets, a.k.a. bytes) followed by a byte of value 0

Return type:string
get_length()

Get the length of the raw string including the ULEB128 coded length and the null byte terminator

Returns:int
get_obj()
get_off()
get_raw()

Returns the raw string including the ULEB128 coded length and null byte string terminator

Returns:bytes
get_unicode()

Returns a Unicode string. This is the actual string. Beware that some strings might not be decodable with the usual UTF-16 decoder, as they use surrogates that are not supported by Python.

get_utf16_size()

Return the size of this string, in UTF-16 code units

Return type:int

reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.StringIdItem(buff, cm)

Bases: object

This class can parse a string_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the string_id_item
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_off()
get_raw()
get_string_data_off()

Return the offset from the start of the file to the string data for this item

Return type:int
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.TryItem(buff, cm)

Bases: object

This class represents the try_item format

Parameters:
  • buff (string) – a raw buffer containing the try_item data
  • cm (ClassManager object) – the ClassManager
get_handler_off()

Get the offset in bytes from the start of the associated EncodedCatchHandlerList to the EncodedCatchHandler for this entry.

Return type:int
get_insn_count()

Get the number of 16-bit code units covered by this entry

Return type:int
get_length()
get_off()
get_raw()
get_start_addr()

Get the start address of the block of code covered by this entry. The address is a count of 16-bit code units to the start of the first covered instruction.

Return type:int
set_off(off)
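
The three getters correspond to the fields of a try_item. As a minimal sketch, assuming the layout from the DEX format specification (uint start_addr, ushort insn_count, ushort handler_off), the record and the byte range it covers can be computed as follows. `parse_try_item` is a hypothetical helper and not part of Androguard.

```python
import struct


def parse_try_item(raw, off=0):
    """Return (start_addr, insn_count, handler_off) of one 8-byte try_item."""
    return struct.unpack_from("<IHH", raw, off)


start_addr, insn_count, handler_off = parse_try_item(struct.pack("<IHH", 4, 10, 1))

# Addresses and counts are in 16-bit code units, so multiply by two for bytes.
first_byte = start_addr * 2
end_byte = (start_addr + insn_count) * 2  # exclusive end of the covered block
```
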
class androguard.core.bytecodes.dvm.TypeHIdItem(size, buff, cm)

Bases: object

This class can parse a list of type_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the list of type_id_item
  • cm (ClassManager) – a ClassManager object
get(idx)
get_length()
get_obj()
get_off()
get_raw()
get_type()

Return the list of type_id_item

Return type:a list of TypeIdItem objects
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.TypeIdItem(buff, cm)

Bases: object

This class can parse a type_id_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the type_id_item
  • cm (ClassManager) – a ClassManager object
get_descriptor_idx()

Return the index into the string_ids list for the descriptor string of this type

Return type:int
get_descriptor_idx_value()

Return the string associated to the descriptor

Return type:string
get_length()
get_obj()
get_raw()
reload()
show()
class androguard.core.bytecodes.dvm.TypeItem(buff, cm)

Bases: object

This class can parse a type_item of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the type_item
  • cm (ClassManager) – a ClassManager object
get_length()
get_obj()
get_raw()
get_string()

Return the type string

Return type:string
get_type_idx()

Return the index into the type_ids list

Return type:int
show()
class androguard.core.bytecodes.dvm.TypeList(buff, cm)

Bases: object

This class can parse a type_list of a dex file

Parameters:
  • buff (Buff object) – a string which represents a Buff object of the type_list
  • cm (ClassManager) – a ClassManager object
get_length()
get_list()

Return the list of TypeItem

Return type:a list of TypeItem objects
get_obj()
get_off()
get_pad()

Return the alignment string

Return type:string
get_raw()
get_size()

Return the size of the list, in entries

Return type:int
get_string()

Return the concatenation of all strings

Return type:string
get_type_list_off()

Return the offset of the item

Return type:int
reload()
set_off(off)
show()
class androguard.core.bytecodes.dvm.Unresolved(cm, data)

Bases: androguard.core.bytecodes.dvm.Instruction

get_length()

Return the length of the instruction

Return type:int
get_name()

Return the name of the instruction

Return type:string
get_op_value()

Return the value of the opcode

Return type:int
get_operands(idx=-1)

Return all operands

Return type:list
get_output(idx=-1)

Return an additional output of the instruction

Return type:string
get_raw()

Return the object in a raw format

Return type:string
androguard.core.bytecodes.dvm.clean_name_instruction(instruction)
androguard.core.bytecodes.dvm.determineException(vm, m)

Returns try-catch handler inside the method.

Parameters:
Returns:

androguard.core.bytecodes.dvm.determineNext(i, end, m)
androguard.core.bytecodes.dvm.get_access_flags_string(value)

Transform an access flag field to the corresponding string

Parameters:value (int) – the value of the access flags
Return type:string
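
The transformation can be sketched as a simple bit test per flag. The table below lists only a subset of the access flag values defined in the DEX format specification, and the exact names, order and separator used by get_access_flags_string may differ. `access_flags_to_string` is a hypothetical helper written for this example.

```python
# Subset of the DEX access flags (bit value, name)
ACCESS_FLAGS = [
    (0x0001, "public"),
    (0x0002, "private"),
    (0x0004, "protected"),
    (0x0008, "static"),
    (0x0010, "final"),
    (0x0020, "synchronized"),
    (0x0100, "native"),
    (0x0200, "interface"),
    (0x0400, "abstract"),
]


def access_flags_to_string(value):
    """Join the names of all flags whose bit is set in value."""
    return " ".join(name for bit, name in ACCESS_FLAGS if value & bit)


flags = access_flags_to_string(0x19)
```
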
androguard.core.bytecodes.dvm.get_byte(buff)
androguard.core.bytecodes.dvm.get_bytecodes_method(dex_object, ana_object, method)
androguard.core.bytecodes.dvm.get_bytecodes_methodx(method, mx)
androguard.core.bytecodes.dvm.get_extented_instruction(cm, op_value, buff)
androguard.core.bytecodes.dvm.get_instruction(cm, op_value, buff, odex=False)
androguard.core.bytecodes.dvm.get_instruction_payload(op_value, buff)
androguard.core.bytecodes.dvm.get_kind(cm, kind, value)

Return the value of the ‘kind’ argument

Parameters:
  • cm (ClassManager) – a ClassManager object
  • kind (int) – the type of the ‘kind’ argument
  • value (int) – the value of the ‘kind’ argument
Return type:string

androguard.core.bytecodes.dvm.get_optimized_instruction(cm, op_value, buff)
androguard.core.bytecodes.dvm.get_params_info(nb, proto)
androguard.core.bytecodes.dvm.get_sbyte(buff)
androguard.core.bytecodes.dvm.get_type(atype, size=None)

Retrieve the type of a descriptor (e.g. I)
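
What such a descriptor-to-type mapping looks like can be sketched as follows. The sketch follows the type descriptor syntax of the DEX format (single letters for primitives, `L...;` for classes, leading `[` for arrays). `descriptor_to_type` is a hypothetical helper, and the exact strings returned by get_type may differ.

```python
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}


def descriptor_to_type(descriptor):
    """Translate a type descriptor such as 'I' into a readable type name."""
    dimensions = 0
    while descriptor.startswith("["):  # each '[' adds one array dimension
        dimensions += 1
        descriptor = descriptor[1:]
    if descriptor.startswith("L") and descriptor.endswith(";"):
        name = descriptor[1:-1].replace("/", ".")
    else:
        name = PRIMITIVES[descriptor]
    return name + "[]" * dimensions
```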

androguard.core.bytecodes.dvm.read_null_terminated_string(f)

Read a null terminated string from a file-like object.

Parameters:f – file-like object
Return type:bytearray
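
Reading a null-terminated string from a file-like object can be sketched as below. `read_cstring` is a hypothetical stand-in written for this example. Stopping silently at the end of the stream is an assumption of the sketch and may not match the behaviour of read_null_terminated_string.

```python
import io


def read_cstring(f):
    """Read bytes up to (and consuming) the next null byte."""
    out = bytearray()
    while True:
        byte = f.read(1)
        if not byte or byte == b"\x00":  # end of stream or terminator
            return out
        out += byte


stream = io.BytesIO(b"abc\x00def")
first = read_cstring(stream)
position = stream.tell()  # the terminator has been consumed as well
```
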
androguard.core.bytecodes.dvm.readsleb128(buff)

Read a signed LEB128 at the current position of the buffer.

Parameters:buff – a file like object
Returns:decoded sLEB128
androguard.core.bytecodes.dvm.readuleb128(buff)

Read an unsigned LEB128 at the current position of the buffer

Parameters:buff – a file like object
Returns:decoded unsigned LEB128
androguard.core.bytecodes.dvm.readuleb128p1(buff)

Read an unsigned LEB128p1 at the current position of the buffer. This format is the same as uLEB128 but has the ability to store the value -1.

Parameters:buff – a file like object
Returns:decoded uLEB128p1
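
The three decoders can be sketched in a few lines each. This is a minimal illustration of the LEB128 variants as described in the DEX format specification, not Androguard's code. The function names are hypothetical, and the sketch raises an IndexError if the stream ends prematurely.

```python
import io


def read_uleb128(buff):
    """Decode an unsigned LEB128: 7 payload bits per byte, low bits first."""
    result = 0
    shift = 0
    while True:
        byte = buff.read(1)[0]
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # a clear high bit marks the last byte
            return result


def read_uleb128p1(buff):
    """uLEB128p1 stores value + 1, so -1 is encoded as the single byte 0."""
    return read_uleb128(buff) - 1


def read_sleb128(buff):
    """Decode a signed LEB128, sign-extending from bit 6 of the last byte."""
    result = 0
    shift = 0
    while True:
        byte = buff.read(1)[0]
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    if byte & 0x40:
        result -= 1 << shift
    return result
```

For example, the bytes e5 8e 26 decode to 624485 as an unsigned value, and c0 bb 78 decode to -123456 as a signed value.
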
androguard.core.bytecodes.dvm.static_operand_instruction(instruction)
androguard.core.bytecodes.dvm.writesleb128(value)

Convert an integer value to the corresponding signed LEB128

Parameters:value – integer value
Returns:bytes
androguard.core.bytecodes.dvm.writeuleb128(value)

Convert an integer value to the corresponding unsigned LEB128.

Raises a value error, if the given value is negative.

Parameters:value – non-negative integer
Returns:bytes
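
The encoding direction can be sketched as follows. This is a minimal illustration of the LEB128 encoding rules, with hypothetical function names, and is not Androguard's implementation.

```python
def write_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def write_sleb128(value):
    """Encode an integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift, so negative values converge to -1
        # Stop once the remaining value is pure sign extension of bit 6.
        done = (value == 0 and not byte & 0x40) or (value == -1 and bool(byte & 0x40))
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)
```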

androguard.core.bytecodes.axml module

class androguard.core.bytecodes.axml.ARSCComplex(buff, parent=None)

Bases: object

This is actually a ResTable_map_entry

It contains a set of {name: value} mappings, which are of type ResTable_map. A ResTable_map contains two items: ResTable_ref and Res_value.

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#1485 for ResTable_map_entry and http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#1498 for ResTable_map

class androguard.core.bytecodes.axml.ARSCHeader(buff, expected_type=None)

Bases: object

Object which contains a Resource Chunk. This is an implementation of the ResChunk_header.

It will throw a ResParserError if the header could not be read successfully.

It is not checked if the data is outside the buffer size nor if the current chunk fits into the parent chunk (if any)!

The parameter expected_type can be used to immediately check the header for the type or raise a ResParserError. This is useful if you know what type of chunk must follow.

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#196

Raises:ResParserError

SIZE = 8
end

Get the absolute offset inside the file, where the chunk ends. This is equal to ARSCHeader.start + ARSCHeader.size.

header_size

Size of the chunk header (in bytes). Adding this value to the address of the chunk allows you to find its associated data (if any).

size

Total size of this chunk (in bytes). This is the chunkSize plus the size of any data associated with the chunk. Adding this value to the chunk allows you to completely skip its contents (including any child chunks). If this value is the same as chunkSize, there is no data associated with the chunk.

type

Type identifier for this chunk
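
How the four attributes relate can be sketched by reading a ResChunk_header by hand. The sketch assumes the layout from ResourceTypes.h (uint16 type, uint16 headerSize, uint32 size, 8 bytes in total). `ChunkHeader` and `read_chunk_header` are hypothetical names for this example, and the bounds check shown is an addition of the sketch.

```python
import struct
from collections import namedtuple

ChunkHeader = namedtuple("ChunkHeader", "start type header_size size end")
HEADER_SIZE = 8  # same value as ARSCHeader.SIZE


def read_chunk_header(buff, start=0):
    """Read the 8-byte chunk header located at offset start."""
    if len(buff) - start < HEADER_SIZE:
        raise ValueError("not enough data for a chunk header")
    chunk_type, header_size, size = struct.unpack_from("<HHI", buff, start)
    # The chunk ends at start + size; its payload begins at start + header_size.
    return ChunkHeader(start, chunk_type, header_size, size, start + size)


# A fake RES_TABLE_TYPE (0x0002) chunk: 12-byte header, 40 bytes in total
raw = struct.pack("<HHI", 0x0002, 12, 40) + b"\x00" * 32
header = read_chunk_header(raw)
```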

class androguard.core.bytecodes.axml.ARSCParser(raw_buff)

Bases: object

Parser for resources.arsc files

The ARSC File is, like the binary XML format, a chunk based format. Both formats are actually identical but use different chunks in order to store the data.

The outermost chunk in the ARSC file is a chunk of type RES_TABLE_TYPE. Inside this chunk is a StringPool and at least one package.

Each package is a chunk of type RES_TABLE_PACKAGE_TYPE. It in turn contains many more chunks.

class ResourceResolver(android_resources, config=None)

Bases: object

Resolves resources by ID and configuration. This resolver deals with complex resources as well as with references.

put_ate_value(result, ate, config)

Put a ResTableEntry into the list of results

Parameters:
  • result (list) – results array
  • ate (ARSCResTableEntry) –
  • config (ARSCResTableConfig) –

put_item_value(result, item, config, parent, complex_)

Put the tuple (ARSCResTableConfig, resolved string) into the result set

Parameters:
Returns:

resolve(res_id)

Resolves the given ID into the resource and returns a list of matching resources.

Parameters:res_id (int) – numerical ID of the resource
Returns:a list of tuples of (ARSCResTableConfig, str)
get_bool_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘bool’.

Read more about bool resources: https://developer.android.com/guide/topics/resources/more-resources.html#Bool

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: '\x00\x00')
get_color_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘color’.

Read more about color resources: https://developer.android.com/guide/topics/resources/more-resources.html#Color

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: '\x00\x00')
get_dimen_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘dimen’.

Read more about Dimension resources: https://developer.android.com/guide/topics/resources/more-resources.html#Dimension

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: ‘\x00\x00’)
get_id(package_name, rid, locale='\x00\x00')

Returns the tuple (resource_type, resource_name, resource_id) for the given resource_id.

Parameters:
  • package_name – package name to query
  • rid – the resource_id
  • locale – specific locale
Returns:

tuple of (resource_type, resource_name, resource_id)

get_id_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘id’.

Read more about ID resources: https://developer.android.com/guide/topics/resources/more-resources.html#Id

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: ‘\x00\x00’)
get_integer_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘integer’.

Read more about integer resources: https://developer.android.com/guide/topics/resources/more-resources.html#Integer

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: ‘\x00\x00’)
get_items(package_name)
get_locales(package_name)

Retrieve a list of all available locales in a given package name.

Parameters:package_name – the package name to get locales of
get_packages_names()

Retrieve a list of all package names, which are available in the given resources.arsc.

get_public_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘public’.

The public resources table contains the IDs for each item.

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: ‘\x00\x00’)
get_res_configs(rid, config=None, fallback=True)

Return the resources found with the ID rid and select the right one based on the configuration, or return all if no configuration was set.

This method is deliberately lenient: if more than one item is found, the default configuration is requested and no default entry exists, it falls back to returning at least one resource (the first one in the list).

Hitting this fallback is usually a bad sign (i.e. the developer did not follow the Android documentation: https://developer.android.com/guide/topics/resources/localization.html#failing2). In practice, an app might simply be designed to run on a single locale and thus only ships resources for that locale.

You can disable this fallback behaviour to get exactly the entries that match the given configuration.

Parameters:
  • rid – resource id as int
  • config – a config to resolve from, or None to get all results
  • fallback – Enable the fallback for resolving default configuration (default: True)
Returns:

a list of tuples of (ARSCResTableConfig, ARSCResTableEntry)
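
The selection policy described above can be sketched without Androguard. This is a simplified illustration of the documented behaviour, not the library's implementation: plain strings stand in for ARSCResTableConfig and ARSCResTableEntry objects, and the function name select_resources and the "default" marker are hypothetical.

```python
def select_resources(options, config=None, default_config="default", fallback=True):
    """
    Pick entries from `options`, a dict mapping configuration -> entry.

    Strings stand in for the real configuration and entry objects.
    """
    if config is None:
        # No configuration requested: return everything
        return list(options.items())
    if config in options:
        # Exact match for the requested configuration
        return [(config, options[config])]
    if fallback and config == default_config and options:
        # Default requested but missing: return the first entry instead
        return [next(iter(options.items()))]
    return []


# Fabricated example: an app that ships only localized strings
options = {"de": "Hallo", "fr": "Bonjour"}
everything = select_resources(options)
french = select_resources(options, config="fr")
lenient = select_resources(options, config="default")
strict = select_resources(options, config="default", fallback=False)
```

With the fallback enabled, a request for the missing default configuration yields the first entry; with fallback=False the result is empty.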

get_res_id_by_key(package_name, resource_type, key)
get_resolved_res_configs(rid, config=None)

Return a list of resolved resource IDs with their corresponding configuration. The return type is similar to that of get_res_configs(), but complex entries and references are handled as well, and the tuples contain the resolved values instead of ARSCResTableEntry objects.

This is the preferred way of resolving resource IDs to their resources.

Parameters:
  • rid (int) – the numerical ID of the resource
  • config (ARSCResTableConfig) – the desired configuration or None to retrieve all
Returns:

A list of tuples of (ARSCResTableConfig, str)

get_resolved_strings()
get_resource_bool(ate)
get_resource_color(ate)
get_resource_dimen(ate)
get_resource_id(ate)
get_resource_integer(ate)
get_resource_string(ate)
get_resource_style(ate)
get_resource_xml_name(r_id, package=None)

Returns the XML name for a resource, including the package name if package is None. A full name might look like @com.example:string/foobar. Otherwise the name is only looked up in the specified package and is returned without the package name. The same example from above without the package name will read as @string/foobar.

If the ID could not be found, None is returned.

A description of the XML name can be found here: https://developer.android.com/guide/topics/resources/providing-resources#ResourcesFromXml

Parameters:
  • r_id – numerical ID of the resource
  • package – package name
Returns:

XML name identifier
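
The two name forms can be illustrated with a small standalone sketch. It does not call Androguard. The lookup table, the resource ID and the function name resource_xml_name are made up for illustration.

```python
# Hypothetical lookup table: package name -> {resource ID: (type, name)}
RESOURCES = {
    "com.example": {0x7F040001: ("string", "foobar")},
}


def resource_xml_name(resources, r_id, package=None):
    """Build the XML name for r_id, or return None if it is unknown."""
    if package is not None:
        # Look only inside the given package, omit the package prefix
        entry = resources.get(package, {}).get(r_id)
        if entry is None:
            return None
        return "@{}/{}".format(*entry)
    # No package given: search all packages and include the package name
    for package_name, entries in resources.items():
        if r_id in entries:
            return "@{}:{}/{}".format(package_name, *entries[r_id])
    return None


full_name = resource_xml_name(RESOURCES, 0x7F040001)
short_name = resource_xml_name(RESOURCES, 0x7F040001, package="com.example")
missing = resource_xml_name(RESOURCES, 0x7F04FFFF)
```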

get_string(package_name, name, locale='\x00\x00')
get_string_resources(package_name, locale='\x00\x00')

Get the XML (as string) of all resources of type ‘string’.

Read more about string resources: https://developer.android.com/guide/topics/resources/string-resource.html

Parameters:
  • package_name – the package name to get the resources for
  • locale – the locale to get the resources for (default: ‘\x00\x00’)
get_strings_resources()

Get the XML (as string) of all resources of type ‘string’. This is a combined variant, which has all locales and all package names stored.

get_type_configs(package_name, type_name=None)
get_types(package_name, locale='\x00\x00')

Retrieve a list of all types which are available in the given package and locale.

Parameters:
  • package_name – the package name to get types of
  • locale – the locale to get types of (default: ‘\x00\x00’)
static parse_id(name)

Resolves an ID from a binary XML file in the form “@[package:]DEADBEEF” and returns a tuple of resource ID and package name. If no package name was given, i.e. the ID has the form “@DEADBEEF”, the package name is set to None.

Raises a ValueError if the id is malformed.

Parameters:name – the string of the resource, as in the binary XML file
Returns:a tuple of (resource_id, package_name).
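
A minimal sketch of this parsing step, written independently of Androguard, could look as follows. The function name parse_resource_id is hypothetical, and the requirement of exactly eight hexadecimal digits is an assumption derived from the “DEADBEEF” form shown above.

```python
import re

# "@", an optional "package:" prefix, then eight hex digits (assumed format)
_RESOURCE_ID = re.compile(r"@(?:([^:@]+):)?([0-9A-Fa-f]{8})")


def parse_resource_id(name):
    """Return (resource_id, package_name) or raise ValueError."""
    match = _RESOURCE_ID.fullmatch(name)
    if match is None:
        raise ValueError("malformed resource ID: {!r}".format(name))
    package, hex_id = match.groups()
    # package is None when the "package:" prefix is absent
    return int(hex_id, 16), package


without_package = parse_resource_id("@DEADBEEF")
with_package = parse_resource_id("@android:01010003")
```
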
class androguard.core.bytecodes.axml.ARSCResStringPoolRef(buff, parent=None)

Bases: object

This is actually a Res_value. It holds information about the stored resource value.

See: http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#262

format_value()

Return the formatted (interpreted) data according to data_type.

get_data()
get_data_type()
get_data_type_string()
get_data_value()
is_reference()

Returns True if the Res_value is actually a reference to another resource

class androguard.core.bytecodes.axml.ARSCResTableConfig(buff=None, **kwargs)

Bases: object

ARSCResTableConfig contains the configuration for specific resource selection. This is used on the device to determine which resources should be loaded based on different properties of the device, such as locale or display size.

See the definition of ResTable_config in http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#911

classmethod default_config()
get_config_name_friendly()

Here for legacy reasons.

Use get_qualifier() instead.

get_country()
get_density()
get_language()
get_language_and_region()

Returns the combined language+region string, or ‘\x00\x00’ for the default locale.

get_qualifier()

Return the resource name qualifier for the current configuration, for example:

  • ldpi-v4
  • hdpi-v4

All possible qualifiers are listed in table 2 of https://developer.android.com/guide/topics/resources/providing-resources

Todo: This name might not have all properties set! Therefore the returned value might not reflect the true qualifier name!

Returns:str

is_default()

Test if this is a default resource, which matches all configurations.

This is indicated by all fields being zero.

Returns:True if default, False otherwise

class androguard.core.bytecodes.axml.ARSCResTableEntry(buff, mResId, parent=None)

Bases: object

A ResTable_entry.

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#1458

FLAG_COMPLEX = 1
FLAG_PUBLIC = 2
FLAG_WEAK = 4
get_index()
get_key_data()
get_value()
is_complex()
is_public()
is_weak()
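
The three flag constants are bit masks, so several of them can be set at once. The following sketch only illustrates the bit tests. It assumes that is_complex(), is_public() and is_weak() check the corresponding bit, and the function name describe_flags is hypothetical.

```python
# Flag values as listed for ARSCResTableEntry
FLAG_COMPLEX = 1
FLAG_PUBLIC = 2
FLAG_WEAK = 4


def describe_flags(flags):
    """Decode the flags field of an entry into three booleans."""
    return {
        "complex": bool(flags & FLAG_COMPLEX),
        "public": bool(flags & FLAG_PUBLIC),
        "weak": bool(flags & FLAG_WEAK),
    }


# Fabricated example: an entry that is both complex and public
decoded = describe_flags(FLAG_COMPLEX | FLAG_PUBLIC)
```
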
class androguard.core.bytecodes.axml.ARSCResTablePackage(buff, header)

Bases: object

A ResTable_package

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#861

get_name()
class androguard.core.bytecodes.axml.ARSCResType(buff, parent=None)

Bases: object

This is a ResTable_type without its ResChunk_header. It contains a ResTable_config.

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#1364

get_package_name()
get_type()
class androguard.core.bytecodes.axml.ARSCResTypeSpec(buff, parent=None)

Bases: object

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#1327

class androguard.core.bytecodes.axml.AXMLParser(raw_buff)

Bases: object

AXMLParser reads through all chunks in the AXML file and implements a state machine to return information about the current chunk, which can then be read by AXMLPrinter.

An AXML file is a file which contains multiple chunks of data, defined by the ResChunk_header. There is no real file magic but as the size of the first header is fixed and the type of the ResChunk_header is set to RES_XML_TYPE, a file will usually start with 0x03000800. But there are several examples where the type is set to something else, probably in order to fool parsers.

Typically the AXMLParser is used in a loop which terminates when m_event is set to END_DOCUMENT. You can use the next() function to get the next chunk. Note that not all chunk types are yielded from the iterator! Some chunks are processed in the AXMLParser only. If the parser encounters something invalid, is_valid() will return False, and messages describing what is wrong are logged.

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#563
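
The usual start sequence 0x03000800 follows directly from the little-endian ResChunk_header: the type RES_XML_TYPE (0x0003) is stored as 03 00 and the header size 8 as 08 00. The following standalone sketch checks only this usual case. It is not Androguard's validation logic, and the function name looks_like_axml is hypothetical. As noted above, files with a different type field exist, and such files fail this simple check.

```python
import struct

RES_XML_TYPE = 0x0003


def looks_like_axml(data):
    """Heuristic: does the data start with the usual AXML chunk header?"""
    if len(data) < 8:
        return False
    chunk_type, header_size, _size = struct.unpack_from("<HHI", data)
    return chunk_type == RES_XML_TYPE and header_size == 8


# Fabricated example: an empty XML chunk that consists of its header only
sample = struct.pack("<HHI", RES_XML_TYPE, 8, 8)
usual_prefix = sample[:4]
```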

comment

Return the comment at the current position or None if no comment is given

This works only for Tags, as the comments of Namespaces are silently dropped. Currently, there is no way of retrieving comments of namespaces.

getAttributeCount()

Return the number of Attributes for a Tag or -1 if not in a tag

getAttributeName(index)

Returns the String which represents the attribute name

getAttributeNamespace(index)

Return the Namespace URI (if any) for the attribute

getAttributeUri(index)

Returns the numeric ID for the namespace URI of an attribute

getAttributeValue(index)

This function is only used to look up strings. All other work is done by format_value().

FIXME: these functions should be united.

Parameters:index – index of the attribute

getAttributeValueData(index)

Return the data of the attribute at the given index

Parameters:index – index of the attribute
getAttributeValueType(index)

Return the type of the attribute at the given index

Parameters:index – index of the attribute
getName()

Legacy only! Use name instead.

getPrefix()

Legacy only! Use namespace instead.

getText()

Legacy only! Use text instead.

is_valid()

Get the state of the AXMLParser. If an error happened somewhere in the process of parsing the file, this flag is set to False.

name

Return the String associated with the tag name

namespace

Return the Namespace URI (if any) as a String for the current tag

nsmap

Returns the current namespace mapping as a dictionary

There are several problems with the map and we try to guess a few things here:

  1. a URI can be mapped by many prefixes, so we have to decide which one to use
  2. a prefix might map to an empty string (some packers)
  3. uri+prefix mappings might be included several times
  4. a prefix might be empty
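
One possible way to deal with these four problems is sketched below. This is an illustrative policy only and not necessarily the one Androguard applies: empty prefixes and empty URIs are skipped, and the first prefix seen for a URI wins. The function name build_nsmap and the input format, a list of (prefix, uri) tuples in document order, are assumptions.

```python
def build_nsmap(pairs):
    """Build a prefix -> URI dictionary from (prefix, uri) tuples."""
    nsmap = {}
    seen_uris = set()
    for prefix, uri in pairs:
        # Problems 2 and 4: ignore empty URIs and empty prefixes
        if not prefix or not uri:
            continue
        # Problems 1 and 3: keep only the first mapping per prefix and URI
        if prefix in nsmap or uri in seen_uris:
            continue
        nsmap[prefix] = uri
        seen_uris.add(uri)
    return nsmap


ANDROID_NS = "http://schemas.android.com/apk/res/android"
# Fabricated input that shows all four problems
pairs = [
    ("android", ANDROID_NS),
    ("android", ANDROID_NS),      # duplicate mapping
    ("a", ANDROID_NS),            # second prefix for the same URI
    ("", "http://example.com/"),  # empty prefix
    ("app", ""),                  # prefix mapped to an empty string
]
nsmap = build_nsmap(pairs)
```
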
text

Return the String associated with the current text

class androguard.core.bytecodes.axml.AXMLPrinter(raw_buff)

Bases: object

Converter for AXML files into an lxml ElementTree, which can easily be converted into XML.

A Reference Implementation can be found at http://androidxref.com/9.0.0_r3/xref/frameworks/base/tools/aapt/XMLNode.cpp

get_buff()

Returns the raw XML file without prettification applied.

Returns:bytes, encoded as UTF-8
get_xml(pretty=True)

Get the XML as a UTF-8 string

Returns:bytes encoded as UTF-8
get_xml_obj()

Get the XML as an ElementTree object

Returns:lxml.etree.Element
is_packed()

Returns True if the AXML is likely to be packed

Packers do some weird stuff and we try to detect it. Sometimes the files are not packed but simply broken or compiled with some broken version of a tool. Some forms of file corruption might also make a file appear to be packed.

Returns:True if packer detected, False otherwise
is_valid()

Return the state of the AXMLParser. If this flag is set to False, the parsing has failed, thus the resulting XML will not work or will even be empty.

class androguard.core.bytecodes.axml.PackageContext(current_package, stringpool_main, mTableStrings, mKeyStrings)

Bases: object

get_mResId()
get_package_name()
set_mResId(mResId)
exception androguard.core.bytecodes.axml.ResParserError

Bases: Exception

Exception for the parsers

class androguard.core.bytecodes.axml.StringBlock(buff, header)

Bases: object

StringBlock is a chunk inside an AXML file: ResStringPool_header. It contains all strings, which are used by referencing their IDs.

See http://androidxref.com/9.0.0_r3/xref/frameworks/base/libs/androidfw/include/androidfw/ResourceTypes.h#436

getString(idx)

Return the string at the index in the string table

Parameters:idx – index in the string table
Returns:str
getStyle(idx)

Return the style associated with the index

Parameters:idx – index of the style
Returns:
show()

Print some information on stdout about the string table

androguard.core.bytecodes.axml.complexToFloat(xcomplex)

Convert a complex unit into float
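
The conversion can be sketched as follows. The sketch mirrors the formula of TypedValue.complexToFloat in the Android framework: the upper 24 bits are a signed mantissa and bits 4 and 5 select one of four radix positions. Treating the input as a signed 32 bit integer is an assumption that follows the Android source and is not necessarily what Androguard does. The function name complex_to_float is hypothetical.

```python
# The mantissa sits in bits 8..31, so it is scaled down by 2**8
MANTISSA_MULT = 1.0 / (1 << 8)
# One multiplier per radix: 23p0, 16p7, 8p15 and 0p23
RADIX_MULTS = (
    1.0 * MANTISSA_MULT,
    1.0 / (1 << 7) * MANTISSA_MULT,
    1.0 / (1 << 15) * MANTISSA_MULT,
    1.0 / (1 << 23) * MANTISSA_MULT,
)


def complex_to_float(xcomplex):
    """Convert a complex value (dimension or fraction data) into a float."""
    # Interpret the input as a signed 32 bit integer (assumption)
    value = xcomplex & 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    mantissa = value & ~0xFF      # clear the unit and radix bits
    radix = (value >> 4) & 0x3    # bits 4 and 5 select the multiplier
    return mantissa * RADIX_MULTS[radix]


# 0x1001: mantissa 16, radix 0 (23p0), unit 1
sixteen = complex_to_float(0x1001)
# 0x8010: mantissa 0x80, radix 1 (16p7), unit 0
one = complex_to_float(0x8010)
```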

androguard.core.bytecodes.axml.format_value(_type, _data, lookup_string=<function <lambda>>)

Format a value based on type and data. By default, no strings are looked up and “<string>” is returned. You need to define lookup_string in order to actually lookup strings from the string table.

Parameters:
  • _type – The numeric type of the value
  • _data – The numeric data of the value
  • lookup_string – A function that resolves strings from integer IDs
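
The role of lookup_string can be illustrated with a reduced, standalone sketch. It covers only a few Res_value data types, whose numeric values are taken from ResourceTypes.h. The output formats chosen here are assumptions for illustration and may differ from what Androguard's format_value() returns. The name format_value_sketch is hypothetical.

```python
# A few Res_value data types from ResourceTypes.h
TYPE_REFERENCE = 0x01
TYPE_STRING = 0x03
TYPE_INT_DEC = 0x10
TYPE_INT_HEX = 0x11
TYPE_INT_BOOLEAN = 0x12


def format_value_sketch(type_, data, lookup_string=lambda index: "<string>"):
    """Format a (type, data) pair; strings are resolved via lookup_string."""
    if type_ == TYPE_STRING:
        return lookup_string(data)
    if type_ == TYPE_REFERENCE:
        return "@{:08X}".format(data)
    if type_ == TYPE_INT_BOOLEAN:
        return "false" if data == 0 else "true"
    if type_ == TYPE_INT_HEX:
        return "0x{:08X}".format(data)
    if type_ == TYPE_INT_DEC:
        # The data field is an unsigned 32 bit value, show it as signed
        return str(data - (1 << 32) if data & 0x80000000 else data)
    return "<0x{:X}, type 0x{:02X}>".format(data, type_)


# Fabricated string table used as lookup function
strings = ["app_name", "hello"]
placeholder = format_value_sketch(TYPE_STRING, 1)
resolved = format_value_sketch(TYPE_STRING, 1, lookup_string=lambda i: strings[i])
```

Without a lookup function only the placeholder is returned, which is why a caller has to pass a function that can access the string table.
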
androguard.core.bytecodes.axml.get_arsc_info(arscobj)

Return a string containing all resources packages ordered by packagename, locale and type.

Parameters:arscobj – ARSCParser
Returns:a string

androguard.core.bytecodes.mutf8 module

class androguard.core.bytecodes.mutf8.PeekIterator(s)

Bases: object

A quick’n’dirty variant of an Iterator that has a special function peek, which will return the next object but not consume it.

idx = 0
next()
peek()
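
The idea of such an iterator can be shown with a short standalone sketch. It is not the Androguard class: the name PeekingIterator is hypothetical, it implements the Python 3 iterator protocol, and returning None from peek() at the end of the input is an assumption.

```python
class PeekingIterator:
    """Iterator over a sequence with a non-consuming peek()."""

    def __init__(self, sequence):
        self._sequence = sequence
        self._idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._idx >= len(self._sequence):
            raise StopIteration
        item = self._sequence[self._idx]
        self._idx += 1
        return item

    def peek(self):
        """Return the next item without consuming it, or None at the end."""
        if self._idx >= len(self._sequence):
            return None
        return self._sequence[self._idx]


it = PeekingIterator(b"ab")
first_peek = it.peek()   # looks at the first byte, consumes nothing
first = next(it)         # now the first byte is consumed
second = next(it)
at_end = it.peek()
```
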
androguard.core.bytecodes.mutf8.chr(val)

Patched version of builtins.chr, to work with narrow Python builds. In those builds, the function unichr does not work with inputs >0x10000.

This seems to be a problem mostly on older Windows builds.

Parameters:val – integer value of character
Returns:character
androguard.core.bytecodes.mutf8.decode(b)

Decode bytes as MUTF-8. See https://docs.oracle.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8 for more information.

Surrogates will be returned as two 16 bit characters.

Parameters:b – bytes to decode
Return type:unicode (py2), str (py3) of 16bit chars
Raises:UnicodeDecodeError if string is not decodable
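
A minimal Python 3 sketch of an MUTF-8 decoder is shown below. It is written from the format description linked above and is not Androguard's implementation. MUTF-8 uses only one, two and three byte sequences, encodes U+0000 as C0 80 and stores characters outside the Basic Multilingual Plane as two three byte surrogates. The function name decode_mutf8 is hypothetical.

```python
def decode_mutf8(data):
    """Decode MUTF-8 bytes into a str; surrogates are kept as two chars."""
    data = bytes(data)
    out = []
    i = 0
    n = len(data)

    def continuation(pos):
        # Fetch a continuation byte (10xxxxxx) or fail
        if pos >= n or data[pos] & 0xC0 != 0x80:
            raise UnicodeDecodeError(
                "mutf-8", data, i, min(pos + 1, n), "invalid continuation byte")
        return data[pos] & 0x3F

    while i < n:
        b0 = data[i]
        if b0 < 0x80:
            # One byte: 0xxxxxxx
            out.append(chr(b0))
            i += 1
        elif b0 & 0xE0 == 0xC0:
            # Two bytes: 110xxxxx 10xxxxxx (C0 80 decodes to U+0000)
            out.append(chr(((b0 & 0x1F) << 6) | continuation(i + 1)))
            i += 2
        elif b0 & 0xF0 == 0xE0:
            # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx, including surrogates
            c1 = continuation(i + 1)
            c2 = continuation(i + 2)
            out.append(chr(((b0 & 0x0F) << 12) | (c1 << 6) | c2))
            i += 3
        else:
            # Four byte UTF-8 sequences do not exist in MUTF-8
            raise UnicodeDecodeError("mutf-8", data, i, i + 1, "invalid start byte")
    return "".join(out)


null_char = decode_mutf8(b"\xc0\x80")
euro = decode_mutf8(b"\xe2\x82\xac")
surrogates = decode_mutf8(b"\xed\xa0\xbd\xed\xb8\x80")
```

The last call yields a string of length two that holds the surrogates U+D83D and U+DE00.
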
androguard.core.bytecodes.mutf8.patch_string(s)

Reorganize a String in such a way that surrogates are printable and lonely surrogates are escaped.

Parameters:s – input string
Returns:string with escaped lonely surrogates and 32bit surrogates
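
The reorganisation can be sketched in Python 3 as follows. This is an illustration of the idea and not Androguard's code: a high surrogate followed by a low surrogate is combined into one code point, and any remaining lonely surrogate is replaced by a textual escape. The escape format \uXXXX and the function name patch_surrogates are assumptions.

```python
def patch_surrogates(s):
    """Combine surrogate pairs and escape lonely surrogates."""
    out = []
    i = 0
    while i < len(s):
        cp = ord(s[i])
        is_high = 0xD800 <= cp <= 0xDBFF
        if is_high and i + 1 < len(s) and 0xDC00 <= ord(s[i + 1]) <= 0xDFFF:
            # Valid pair: compute the code point beyond the BMP
            low = ord(s[i + 1])
            out.append(chr(0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)))
            i += 2
        elif 0xD800 <= cp <= 0xDFFF:
            # Lonely surrogate: make it printable as an escape sequence
            out.append("\\u{:04x}".format(cp))
            i += 1
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


combined = patch_surrogates("\ud83d\ude00")
escaped = patch_surrogates("a\ud83d")
```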

Module contents