Skip to content

Latest commit

 

History

History
213 lines (151 loc) · 7.43 KB

design-notes.org

File metadata and controls

213 lines (151 loc) · 7.43 KB

Design Notes

Previous design

!theme plain

class Context {
        + msgs: Lazy<list[int]>
        + score: Lazy<AnonimityScorer>
        + distribution: Lazy<Distribution>
        + people: int
        + k: int

        + abstract choice(msg_list) -> Choice
}

class UniformContext {
        + choice(msg_list) -> UniformSim
        + distribution: UniformDistribution

}

class ZipfContext {
        + distribution: Zipf
}

class UniformZipfContext {
        + choice: UniformSim
}

class PreferentialContext {
        + initial_weight: int
        + weight: int

        + choice: PASim
}

class Choice{
        + ring_order: int
        + people: int
        + message_list: list[int]

        + abstract apply(): Random<list[list[int]]>


}


class PreferentialAttachmentSim {
        + message_weight: int
        + initial_weight: int
}

Choice <|-- PreferentialAttachmentSim

Context <|-- UniformContext
Context <|-- ZipfContext
ZipfContext <|-- UniformZipfContext
ZipfContext <|-- PreferentialContext

class Simulation {
        + context: Context
        + msg_list: list[??]
        + simulate(seed=None) -> choice
}

uml.png

Problems with the design

Most of the classes are left unusued, and use lazyly loaded properties that need to be changed as an extension of the base class without adding any differential value.

You only want to use lazy properties when:

  • It takes time to load them.
  • They are not always used, and can be computed by the other properties (even if it takes little time)

And both of these reasons don’t appy (it’s O(1) creating them, and they are always used).

At the end of the day, all the contexts can be shown as different costructors of contexts,

Changesin the design

At the end of the day, the interface should be painless to hack around, because that what simulations are used for. When you have to create a flipping heriearchy of classes just to extend a method, then it’s never worth it.

Simplify Context, Simulation

Simulation only adds one method to context, without providing any valuable thing. It can either be made as an inside method for context, or changed to an external function that takes context. I will change it to the later, as it seems that it will be easier to provide.

For Context, all the design will be changed so that it only takes Context class, and use the all the things as variables (because they are exactly that).

Distributioner, Message Passer

It’s literally one method this two interfaces, and it does not make any sense for them to be anything else.

Like, to the point that both create just a list of messages, the MessagePasser is to make creating them lazy, and the Distributioner is to create lazyly the MessagePasser. Like, why??

Choice

I don’t like Choice:

  • It repeats information that we already have in context
  • It only has one method, meaning that it’s just a currification of the function.

That why it will be changed from an abstract class to a well defined Function, which will make it easier to mock and use around.

New design

The idea of a Simulation

Parameters: Message list,

  1. Creates message list using a distribution: Distribution method (variables may change for this)
  2. Creates list of list of identifications, as this one will make the scoring features: Choice method (List[Id] -> Random[List[List[Id]]])
  3. Score the function as a partially scorer: Scorer functions (??)
  4. Encapsulates the result on a Simulation
  5. Scores the Simulation to extract the statistics

Distribution

Distributions are functions that given a context, they generate a list of messages, which are list of integers that represent the identification of the user.

There are two main distributions:

  • Uniform distribution
  • Zipf distribution, which needs a partial application, as there is a max_msg and s parameters that need to be set in order to generate the message.

Context

A context has the parameters that are unchangeble, and unrandomly of a configuration, given the number of people, the ring order, which is the number of people that encrypt the message; and finally the distribution method.

Technically it bootstraps the distribution with a double dispatch to generate and cache the list of messages to be send.

Choices / simulators

The simulate the underlying protocol.

The protocol is a function that changes a list of messages to a list of encrypted/signed messages, which are represented as a list of messages of all the people that have encrypted them. Note that most of them have the last position or the first position with the user that actually send the message. This is not an inconvenient, as it’s only used for analysis.

Then a simulator is something that generates from a context a list of encrypted messages. The context is to better encapsulate the information that is needed for the simulation, as it is pretty important. This lets create scripts more easily than having the lockdown of classes.

Simulation DTO

A simulation is a DTO that has the result of the simulation, that is, the list of encrypted messages, the list of messages send in the order that it did. and the Context of the simulation. This lets to create the statistics using this object as a function or a class.

Analyzer or Reviewer

The reviewer is a class that performs the statistical analysis of a Simulation with cached properties and partial results. It should be faster, but there are no test on how much faster it is.

This is a contrast between the 6 classes before, which will make reviewing easier.

It’s planned to create a way to generate a table row with either csv or org, so that the writing of the article becomes easier. Generating a csv is better for latex tables. If it’s not big enough, then org should be a little bit faster.

Graphics

It was left unused. Probably it’s just better to put the requirements in the scripts and delete the whole package.

UML

!theme plain

class Distribution {
  + __call__(context: Context) -> list[int]
}

class Context {
  + people: int
  + ring_order: int
  + distribution: Context -> list[int]
  + msg_list: list[int]
}

class Simulation {
  + msg_list: list[int]
  + seed: int | None
  + context: Context
  + signature: list[list[int]]
}

class Reviewer {
  + simulation: Simulation
  + scorer: Simulation -> list[float] = anonymity_score
  --
  + scores: ordered_list[(pos, score)]
  + anonymity: people without anonymity
  + medium_desviation: float
  + mean: float
  + median: float
  + min: float
  + max: float
}

Distribution -- Context
Context -- Simulation
Simulation -- Reviewer

Finishing Clearing up

Write documentation on current design.

Everything should have its own documentation, I have become painfully aware of that.

Rewrite all tests with a simulation in mind

The refactor has been major, so everything should be tested again.

Let’s try to have a good coverage before proceeding to study the protocols.

PROJ More simulations

Change in user behaviour

What happens if a superuser goes to lurking for a while?

I should make some simulation to prove the different methodologies.

New encryption method

Get the last n messages to see how it adapts. It should prevent snowballing with more messages, and it would be quicker to adapt when user behaviour patterns change.

Moving average as encryption

Statistical analysis

Can I get who is the superusers accuretely.