
Succinct Data Structure Representation of _type #1

Open
@edefazio

This is a long-term goal:
a tool that takes existing _types and _members and converts them into a succinct data structure,
or takes a succinct data structure and turns it back into a _type or _member.

The succinct data structure is bidirectional and represents _types (_class, _enum, _interface, _annotation)
AND their underlying _members (_initBlock, _method, _constructor, etc.). It:

  • is READ-ONLY
  • stores NO code formatting
  • can be iterated over (just like `_class.forMethods(m -> ...)`) but NOT MUTATED
  • can be walked into (like `Walk.listIn(_sc, Expression.class, e -> out.print(e))`)
  • can be "realized", i.e. return objects (_class, _method, _field) at any nesting level

Internally, I imagine it will be similar to bytecode, with bytes representing opcodes
that link to the names of things in a lookup table.
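
To make the "opcodes plus lookup table" idea concrete, here is a minimal sketch, not a proposed format. The class name `SuccinctSketch`, the opcode values, and the one-byte name index are all assumptions made up for illustration; the issue does not define an instruction set. Each entry is an opcode byte followed by an index into a table of interned names, and the only access path is a read-only sequential walk.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a flat opcode stream plus a name lookup table. */
final class SuccinctSketch {
    // Hypothetical opcodes, invented for this example only.
    static final byte OP_END = 0, OP_CLASS = 1, OP_FIELD = 2, OP_METHOD = 3;

    private final ByteArrayOutputStream code = new ByteArrayOutputStream();
    private final Map<String, Integer> nameToIndex = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    /** Interns a name so each distinct string is stored only once. */
    private int intern(String name) {
        Integer idx = nameToIndex.get(name);
        if (idx == null) {
            idx = names.size();
            nameToIndex.put(name, idx);
            names.add(name);
        }
        return idx;
    }

    /** Appends one opcode followed by a one-byte index into the name table. */
    SuccinctSketch emit(byte opcode, String name) {
        int idx = intern(name);
        if (idx > 255) {
            throw new IllegalStateException("this sketch supports at most 256 names");
        }
        code.write(opcode);
        code.write(idx);
        return this;
    }

    /** Appends the single-byte end marker. */
    SuccinctSketch end() {
        code.write(OP_END);
        return this;
    }

    byte[] bytes() {
        return code.toByteArray();
    }

    int nameCount() {
        return names.size();
    }

    /** Read-only sequential walk: names of all entries carrying the given opcode. */
    List<String> namesOf(byte opcode) {
        byte[] b = bytes();
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < b.length) {
            byte op = b[i];
            if (op == OP_END) {      // end marker has no operand
                i += 1;
                continue;
            }
            int idx = b[i + 1] & 0xFF;
            if (op == opcode) {
                out.add(names.get(idx));
            }
            i += 2;                  // opcode byte + name-index byte
        }
        return out;
    }
}
```

For example, emitting a class `Point` with fields `x` and `y` and a method `x` stores the string `x` once in the table, and `namesOf(OP_FIELD)` walks the bytes without building any AST nodes. A real encoding would need wider indexes, nesting, and expression-level opcodes.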

The purpose of this is to make looking through code more memory efficient
(i.e. I should be able to take TONS of code, like the source code of Linux, and
query it easily).

Looking through ALL code in a project should be fast & memory efficient.
We'll probably have MULTIPLE INDEXES outside of these types that provide information about the class internals to speed up queries (e.g. feature hashing and/or Bloom filters), and internally
we'll be able to load and sequentially walk the data structure, performing analysis and transformations.
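
As a rough illustration of the kind of external index this could mean, below is a minimal Bloom filter over identifier names. It is a sketch under assumptions: the class name `NameBloomFilter`, the FNV-1a-style hash with a seed mixed in, and the double-hashing scheme are choices made for this example, not something the issue specifies. The idea is that a query can skip a class entirely when the filter says a name is definitely absent, and only load and walk the data structure when the name is possibly present.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical per-class index: a small Bloom filter over identifier names. */
final class NameBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    NameBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** FNV-1a-style hash over the UTF-8 bytes, with a seed mixed into the basis. */
    private static int hash(String s, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    /** Double hashing: the i-th bit position is (h1 + i * h2) mod size. */
    private int position(String name, int i) {
        int h1 = hash(name, 0);
        int h2 = hash(name, 0x9E3779B9) | 1; // force odd so the step is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String name) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(name, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(String name) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(name, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A filter like this never reports a name it has seen as absent, but it can report false positives, so a positive answer still has to be confirmed by walking the actual structure.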

More info on succinct data structures and related techniques:
Succinct Data Structure
Feature Hashing
Bloom Filter

edefazio commented on Mar 11, 2020

Generally speaking, I should be able to achieve this by using the existing infrastructure (JavaParser/jdraft) to walk the AST and create a serialized form.

Also, I should consider "fully qualifying everything without imports", i.e. directly scoping all static method calls, `new` expressions, and static field accesses, to reduce ambiguity and make the code easier to use. That way we don't need the Java Symbol Solver; the relationship is stored directly in the AST via the scope.

IF we have the classes available, it'd be nice to use something like ClassGraph to build the call graph, so we wouldn't have to manually resolve the symbols or use the JavaSymbolSolver:

https://github.com/classgraph/classgraph/wiki

An example of fully qualifying, before:

String s = "Hello";
URL url = new URL("http://example.com");
out.println("hey");

after:

java.lang.String s = "Hello";
java.net.URL url = new java.net.URL("http://example.com");
System.out.println("hey");

Here are some more (related) ideas about storage/querying/indexing (Finite State Automata/Bitap):
https://pvk.ca/Blog/2013/06/23/bitsets-match-regular-expressions/
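
For reference, the simplest member of the Bitap family is the shift-and exact matcher, sketched below. This is only the textbook exact-match variant, not the regular-expression matching described in the linked post, and the class name `Bitap` and the 64-character pattern limit (one `long` of state) are assumptions of this example. Bit i of the state is set when the first i+1 pattern characters match the text ending at the current position.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal Bitap (shift-and) exact matcher for patterns of up to 64 chars. */
final class Bitap {
    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int m = pattern.length();
        if (m == 0) {
            return 0;
        }
        if (m > 64) {
            throw new IllegalArgumentException("pattern longer than 64 chars");
        }
        // For each character, bit i is set if pattern.charAt(i) equals that character.
        Map<Character, Long> masks = new HashMap<>();
        for (int i = 0; i < m; i++) {
            masks.merge(pattern.charAt(i), 1L << i, (a, b) -> a | b);
        }
        long state = 0L;
        long accept = 1L << (m - 1); // set when the whole pattern has matched
        for (int j = 0; j < text.length(); j++) {
            long mask = masks.getOrDefault(text.charAt(j), 0L);
            // Extend every partial match by one char and start a new one at j.
            state = ((state << 1) | 1L) & mask;
            if ((state & accept) != 0) {
                return j - m + 1;
            }
        }
        return -1;
    }
}
```

Because the inner loop is a shift, an OR, and an AND per input character, this style of matcher fits the "load and sequentially walk" model well when the walked data is a flat byte or character stream.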



          Succinct Data Structure Representation of _type · Issue #1 · org-jdraft/jdraft