Different similarity outputs between libraries #1

@WJDigby

Hello,

Thank you for providing this code.

When comparing two hashes, this library outputs a different "similarity" score than other ssdeep libraries and examples:

Python 3 ssdeep library, with the same Eicar strings used in the readme:

>>> import ssdeep
>>> e1 = ssdeep.hash("X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*")
>>> e2 = ssdeep.hash("X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-THREATPINCH-ANTIVIRUS-TEST-FILE!$H+H*")
>>> e1
'3:a+JraNvsgzsVqSwHq9:tJuOgzsko'
>>> e2
'3:a+JraNvsg7QhyqzWwHq9:tJuOg7Q4Wo'
>>> ssdeep.compare(e1, e2)
18

JavaScript ssdeep.js library:

>> e1 = ssdeep.digest("X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*")
"3:a+JraNvsgzsVqSwHq9:tJuOgzsko"
>> e2 = ssdeep.digest("X5O!P%@AP[4\\PZX54(P^)7CC)7}$EICAR-THREATPINCH-ANTIVIRUS-TEST-FILE!$H+H*")
"3:a+JraNvsg7QhyqzWwHq9:tJuOg7Q4Wo"
>> ssdeep.similarity(e1, e2)
70

Both libraries produce identical hashes.

The ssdeep online demo also produces a value of 18 when comparing the two Eicar strings:

[image: screenshot of the ssdeep online demo comparison]

Is this intended behavior? Is there a "weight" or some metric that can adjust the grading scale of the comparison?
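
One possible explanation, offered as an unconfirmed guess: as far as I recall, the reference ssdeep implementation caps the score when the block size is small, so that short inputs do not produce exaggerated matches. The sketch below is pure Python and does not call either library. The constants `ROLLING_WINDOW` and `MIN_BLOCKSIZE` and the cap formula are my recollection of the reference C code (assumptions), and `small_blocksize_cap` and `split_digest` are hypothetical helper names. Whether ssdeep.js omits this cap is also an assumption.

```python
from typing import Optional, Tuple

# Assumed constants, recalled from the reference ssdeep C implementation.
ROLLING_WINDOW = 7
MIN_BLOCKSIZE = 3


def small_blocksize_cap(block_size: int, s1: str, s2: str) -> Optional[int]:
    """Return the maximum score allowed at this block size, or None if uncapped.

    Assumed rule: below a threshold block size, the score may not exceed
    block_size / MIN_BLOCKSIZE * min(len(s1), len(s2)).
    """
    threshold = (99 + ROLLING_WINDOW) // ROLLING_WINDOW * MIN_BLOCKSIZE
    if block_size >= threshold:
        return None
    return block_size // MIN_BLOCKSIZE * min(len(s1), len(s2))


def split_digest(digest: str) -> Tuple[int, str, str]:
    """Split 'blocksize:chunk1:chunk2' into its three parts."""
    block_size, first, second = digest.split(":", 2)
    return int(block_size), first, second


e1 = "3:a+JraNvsgzsVqSwHq9:tJuOgzsko"
e2 = "3:a+JraNvsg7QhyqzWwHq9:tJuOg7Q4Wo"

b1, e1_first, e1_second = split_digest(e1)
_, e2_first, e2_second = split_digest(e2)

# The first chunk is compared at the stated block size, the second at twice that.
cap_first = small_blocksize_cap(b1, e1_first, e2_first)
cap_second = small_blocksize_cap(b1 * 2, e1_second, e2_second)
print(cap_first, cap_second)
```

Under these assumptions both caps come out to 18, which matches the Python library and the online demo. If that is the cause, the 70 reported here would be the uncapped edit-distance score.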
