Linq.py

Install Typed-Linq

pip install -U linq-t
  • Typed-Linq = Linq + Static Checking (PyCharm performs best)
  • Linq = Auto-Completion + Readability
  • Static Checking = Safety + Debugging while coding

Additionally, static checking helps type inference, which in turn improves auto-completion.

Here is an example that gets the 10 most frequent pixel values in a picture.

from linq import Flow
import numpy as np

def most_frequent(arr: np.ndarray) -> np.ndarray:
    return  Flow(arr.flatten())                        \
                    .group_by(None)                    \
                    .map(lambda k, v: (k, len(v)))     \
                    .sorted(by=lambda k, count: -count)\
                    .take(10)                          \
                    .map(lambda k, v: k)               \
                    .to_list()                         \
                    .then(np.array)                    \
                    ._  # unbox
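For comparison, here is a rough standard-library sketch of what the pipeline above computes. It does not use Linq.py or NumPy: it assumes the pixels arrive as a flat iterable of hashable values, and the name `most_frequent_plain` is made up for this illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def most_frequent_plain(pixels: Iterable[Hashable], n: int = 10) -> List[Hashable]:
    """Return the n most common values, most frequent first."""
    counts = Counter(pixels)
    # most_common(n) yields (value, count) pairs sorted by descending count.
    return [value for value, _ in counts.most_common(n)]


sample = [0, 255, 255, 17, 255, 17, 3]
top_two = most_frequent_plain(sample, n=2)
```

The Flow version spells out the same steps (group, count, sort, take) as one chain, which is the readability argument this README makes.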

About Linq

Language Integrated Query (LINQ), the well-known EDSL in .NET, is in my opinion one of the best designs in the .NET environment.
Here is an example of C# Linq.

// Calculate MSE loss.
/// <param name="Prediction"> the prediction of the neural network</param>
/// <param name="Expected"> the expected target of the neural network</param>

Prediction.Zip(Expected, (pred, expected) => Math.Pow(pred - expected, 2)).Average()
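The same zip-then-average idea can be written in plain Python. This is a minimal sketch using only the standard library; `mse` is a hypothetical helper name, not part of Linq.py.

```python
from statistics import mean
from typing import Sequence


def mse(prediction: Sequence[float], expected: Sequence[float]) -> float:
    # Pair up the elements, square each difference, then average.
    return mean((p - e) ** 2 for p, e in zip(prediction, expected))


loss = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```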

It is very readable and it doesn't cost much.

Many situations are awkward for Python programmers, and Linq can help a lot there.

Awkward Scenes in Python

seq1 = range(100)
seq2 = range(100, 200)
zipped = zip(seq1, seq2)
mapped = map(lambda ab: ab[0] / ab[1], zipped)
grouped = dict()
group_fn = lambda x: x // 0.2
for e in mapped:
    group_id = group_fn(e)
    if group_id not in grouped:
        grouped[group_id] = [e]
        continue
    grouped[group_id].append(e)
for e in grouped.items():
    print(e)

The code seems too long...

Now we extract the function group_by:

def group_by(f, container):
    grouped = dict()
    for e in container:
        group_id = f(e)
        if group_id not in grouped:
            grouped[group_id] = [e]
            continue
        grouped[group_id].append(e)
    return grouped
res = group_by(lambda x: x // 0.2, map(lambda ab: ab[0] / ab[1], zip(seq1, seq2)))
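If you stay with plain Python, the membership test inside `group_by` can be dropped by using `collections.defaultdict`. The following is one possible variant, not code from Linq.py; the type annotations are added for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def group_by(f: Callable[[T], Hashable], container: Iterable[T]) -> Dict[Hashable, List[T]]:
    grouped: Dict[Hashable, List[T]] = defaultdict(list)
    for e in container:
        # Missing keys start as an empty list, so no membership test is needed.
        grouped[f(e)].append(e)
    # Convert back to a plain dict so later lookups of unknown keys raise KeyError.
    return dict(grouped)


res = group_by(lambda x: x % 3, range(7))
```

This shortens the helper, but the call site still reads inside-out, which is the part a fluent chain addresses.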

Okay, nothing is wrong with it, but it still bothers me: why do I have to write such ugly code?

Now, let us try Linq!

from linq import Flow, extension_std
seq = Flow(range(100))
res = seq.zip(range(100, 200)).map(lambda fst, snd : fst/snd).group_by(lambda num: num//0.2)._

How does Linq.py work?

There is a core class, linq.core.flow.TSource, which has just one member, _.
When you request an extension method from a TSource object, the type of its _ member (and each class in that type's MRO) is searched to see whether the extension method exists.
In other words, extension methods are bound to the type of _.

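from functools import partial
from typing import Generic, TypeVar

T = TypeVar('T')

# `Extension` is the library's registry; as used below, it maps a type to its extension methods.
class TSource: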
    __slots__ = ['_']

    def __init__(self, sequence):
        self._ = sequence

    def __getattr__(self, k):
        for cls in self._.__class__.__mro__:
            namespace = Extension.get(cls, '')
            if k in namespace:
                return partial(namespace[k], self)

        where = ','.join('{}.{}'.format(cls.__module__, cls.__name__) for cls in self._.__class__.__mro__)

        raise NameError("No extension method named `{}` for types `{}`.".format(k, where))

    def __str__(self):
        return self._.__str__()

    def __repr__(self):
        return self._.__repr__()


class Flow(Generic[T]):
    def __new__(cls, seq):
        return TSource(seq)
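To see the lookup rule in isolation, here is a self-contained toy version. It is only a sketch of the idea: `REGISTRY` and `Box` are hypothetical stand-ins for the library's `Extension` registry and `TSource`, and the two registered functions are invented for the demonstration.

```python
from functools import partial

# Hypothetical stand-in for the library's registry: type -> {name: function}.
REGISTRY = {
    object: {"describe": lambda self: "object: {!r}".format(self._)},
    int: {"double": lambda self: self._ * 2},
}


class Box:
    """Toy wrapper that resolves attributes through the wrapped value's MRO."""
    __slots__ = ["_"]

    def __init__(self, value):
        self._ = value

    def __getattr__(self, name):
        # Walk the MRO of the wrapped value, most specific class first.
        for cls in type(self._).__mro__:
            namespace = REGISTRY.get(cls, {})
            if name in namespace:
                return partial(namespace[name], self)
        raise AttributeError(name)


flag = Box(True)             # bool's MRO is (bool, int, object)
doubled = flag.double()      # found in the `int` namespace
described = flag.describe()  # found in the `object` namespace
```

Because the search follows the MRO, a method registered for a base class is also available on wrapped subclass instances. This toy raises `AttributeError` for unknown names, whereas the excerpt above raises `NameError`.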

Extension Method

There are two ways to define extension methods.

  • Use extension_std to add extension methods for all Flow objects.

  • Use extension_class(cls) to add extension methods for all Flow objects whose member _ has type cls.

@extension_std  # For all Flow objects
def add(self, i):
    return self + i

@extension_class(int) # Just for type `int`
def add(self: int, i):
    return self + i

assert Flow(4).add(2)._ == 6
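How such decorators could fill a per-type registry can be sketched as follows. This is an assumption about the general technique, not the library's actual implementation: `EXTENSIONS`, `register_for`, `register_for_all`, `lookup`, and `show` are all hypothetical names.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: type -> {method name: function}.
EXTENSIONS: Dict[type, Dict[str, Callable[..., Any]]] = {}


def register_for(cls: type) -> Callable:
    """Decorator factory: attach the function to values whose type is `cls`."""
    def decorator(func):
        EXTENSIONS.setdefault(cls, {})[func.__name__] = func
        return func
    return decorator


# Registering on `object` makes the method visible for every type.
register_for_all = register_for(object)


def lookup(value: Any, name: str) -> Callable[..., Any]:
    # The most specific class in the MRO wins.
    for cls in type(value).__mro__:
        if name in EXTENSIONS.get(cls, {}):
            return EXTENSIONS[cls][name]
    raise AttributeError(name)


@register_for_all
def show(value):
    return "generic"


@register_for(int)
def show(value):  # same name, registered for a more specific type
    return "int"


int_result = lookup(4, "show")(4)
str_result = lookup("4", "show")("4")
```

In this sketch both definitions coexist under the same name because they are stored per type; the `int` version is chosen for integers and the generic one for everything else.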

Documents of Standard Extension Methods

Note: Docs haven't been finished yet.

How to Contribute

Feel free to open pull requests here.