Args:
- images_folder (str): Path to the folder containing the images.
- csvs_folder (str): Path to the folder containing the CSV annotations.
- xml_output_file (str): Path to the output XML file.
- items (list, optional): List of items to process. If None, all items are processed. Defaults to None.
- parquet (bool, optional): Whether the annotations are stored in Parquet format. Defaults to False.
- relative_df (bool, optional): Whether the bounding box coordinates in the CSV are relative to the image size. Defaults to True.
- default_label (str, optional): Default label for the bounding boxes. Defaults to 'Background'.
- extension (str, optional): Image file extension. Defaults to 'jpg'.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()
X, y = data.data, data.target
X_trn, X_val, y_trn, y_val = train_test_split(X, y, random_state=42)
… and create the data loaders
import torch
from torch.utils.data import TensorDataset, DataLoader
from torch_snippets.markup2 import AD

trn_ds = TensorDataset(*[torch.Tensor(i) for i in [X_trn, y_trn]])
trn_dl = DataLoader(trn_ds, batch_size=32)

val_ds = TensorDataset(*[torch.Tensor(i) for i in [X_val, y_val]])
val_dl = DataLoader(val_ds, batch_size=32)

AD(next(iter(val_dl)))
Next we'll import Capsule and a few decorators that tell the model to switch its mode to train/eval during the fit function:
from torch_snippets.trainer.capsule import Capsule, train, validate, predict
Create the neural network and define its forward function as usual PyTorch business. The only difference is that you'll also add self.loss_fn and self.optimizer attributes in __init__.
To fully describe the model's behaviour, we still need to define three functions:

1. train_batch
2. validate_batch
3. predict (optional)
Ensure you return dictionaries of losses and accuracy metrics from the train_batch and validate_batch functions. You can return as many metrics as you like during training and validation; they will all be auto-logged.

Also make sure at least one of the keys returned by train_batch is loss, as this is used to compute gradients. A minimal sketch is below.
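Here is a minimal sketch of such a model, written for the iris data above. Applying the train/validate decorators to the batch hooks is an assumption based on the imports shown earlier; everything else is standard PyTorch.

import torch.nn as nn

class IrisModel(Capsule):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
        self.loss_fn = nn.CrossEntropyLoss()  # defined in __init__, as described above
        self.optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)

    def forward(self, x):
        return self.net(x)

    @train  # switches the model to train mode during fit
    def train_batch(self, batch):
        x, y = batch
        preds = self(x)
        loss = self.loss_fn(preds, y.long())
        acc = (preds.argmax(-1) == y).float().mean()
        return {"loss": loss, "acc": acc}  # "loss" is mandatory; every key is auto-logged

    @validate  # switches the model to eval mode during fit
    def validate_batch(self, batch):
        x, y = batch
        preds = self(x)
        loss = self.loss_fn(preds, y.long())
        acc = (preds.argmax(-1) == y).float().mean()
        return {"loss": loss, "acc": acc}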
We could now create the model…
model = IrisModel()
model.device = "cpu"
… and run model.fit with an optional number of logs to print to the console.
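For example, something like the call below; the keyword names (epochs, n_logs) are assumptions based on the prose, not confirmed API:

model.fit(trn_dl, val_dl, epochs=10, n_logs=5)  # hypothetical argument names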
Refer to altair-viz.github.io for more awesome charts. torch-snippets exposes a confusion-matrix function, CM, as an example.
Method 1
# assumes `from torch_snippets import *`, which provides np, show and CM
n = 10
a = "qwertyuiopasdfghjklzxcvbnm"
truth = np.random.randint(4, size=1000000)
pred = np.random.randint(4, size=1000000)
show(CM(truth=truth, pred=pred, mapping={i: c for i, c in enumerate(a)}))
# mapping is optional
Method 2

df = pd.DataFrame(
    {
        "truth": [randint(n) for _ in range(1000)],
        "pred": [randint(n) for _ in range(1000)],
    }
)
show(CM(df, "truth", "pred", mapping={i: c for i, c in enumerate(a)}))
# mapping is optional
Plot a spider chart based on the given dataframe.

Parameters:
- df (pandas DataFrame): The input dataframe containing the data to be plotted.
- id_column (str, optional): The column name used as the identifier for each data point. If not provided, the index of the dataframe is used.
- title (str, optional): The title of the spider chart.
- max_values (dict, optional): Maximum values for each category. If not provided, they are calculated from the data.
- padding (float, optional): Padding factor applied when calculating the maximum values. Default is 1.25.
- global_scale (bool or float, optional): If False, each category has its own maximum value. If True, a single maximum value is used for all categories. If a float is provided, it is used as the maximum value for all categories.
- ax (matplotlib Axes, optional): The axes on which to plot the spider chart. If not provided, a new figure and axes are created.
- sz (float, optional): The size of the figure (width and height) in inches. Default is 10.
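A hypothetical usage sketch; the function name spider and the argument names follow the parameter list above but should be treated as assumptions:

import pandas as pd

df = pd.DataFrame({
    "name": ["model_a", "model_b"],
    "precision": [0.80, 0.60],
    "recall": [0.70, 0.90],
    "f1": [0.75, 0.72],
})
spider(df, id_column="name", title="Model comparison")  # hypothetical call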
This function generates Altair-based interactive UpSet plots.

Parameters:
- data (pandas.DataFrame): Tabular data containing the membership of each element (row) in exclusive intersecting sets (column).
- sets (list): Set names of interest to show in the UpSet plots; the list order is the display order.
- abbre (list): Abbreviated set names.
- sort_by (str): 'frequency' or 'degree'.
- sort_order (str): 'ascending' or 'descending'.
- width (int): Horizontal size of the UpSet plot.
- height (int): Vertical size of the UpSet plot.
- height_ratio (float): Ratio of height between the upper and lower views, from 0 to 1.
- horizontal_bar_chart_width (int): Width of the horizontal bar chart on the bottom-right.
- color_range (list): Colors to encode sets.
- highlight_color (str): Color to encode intersecting sets on mouse hover.
- glyph_size (int): Size of the UpSet glyph (⬤).
- set_label_bg_size (int): Size of the label background in the horizontal bar chart.
- line_connection_size (int): Width of the lines in the matrix view.
- horizontal_bar_size (int): Height of the bars in the horizontal bar chart.
- vertical_bar_label_size (int): Font size of the labels in the vertical bar chart on top.
- vertical_bar_padding (int): Gap between a pair of bars in the vertical bar charts.
Configure the top-level settings for an UpSet plot in Altair.

Parameters:
- base: The base chart to configure.
- legend_orient: The orientation of the legend. Default is 'top-left'.
- legend_symbol_size: The size of the legend symbols. Default is 30.
This class provides methods to access and manipulate configuration settings.

Attributes:
- input_variables (list): List of input variables defined in the class constructor.

Methods:
- keys(): Returns the list of input variables.
- __getitem__(key): Returns the value of the specified key.
- __contains__(key): Checks if the specified key is present in the input variables.
- from_ini_file(filepath, config_root=None): Creates an instance of the class from an INI file.
- __repr__(): Returns a string representation of the class.

Example usage:

config = DeepLearningConfig.from_ini_file("config.ini")
print(config.keys())
print(config["learning_rate"])
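A minimal sketch of the round trip, assuming an INI layout with a ModelConfig section; the section and key names here are illustrative, chosen to match the MNIST example below:

from pathlib import Path

Path("config.ini").write_text("""
[ModelConfig]
n_hidden = 128
n_classes = 10
n_layers = 3
""")

config = DeepLearningConfig.from_ini_file("config.ini")
print(config.keys())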
If needed, configs can be unpacked like a dictionary too:
class MNIST(nn.Module):
    """
    A PyTorch module for a multi-layer perceptron (MLP) model for MNIST classification.

    Args:
        n_hidden (int): The number of hidden units in each hidden layer.
        n_classes (int): The number of output classes.
        n_layers (int): The number of hidden layers in the model.

    Attributes:
        model (nn.Sequential): The sequential model that represents the MLP.
    """

    def __init__(self, *, n_hidden, n_classes, n_layers):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(784, n_hidden),  # 28*28 input pixels
            *[
                nn.Sequential(nn.Linear(n_hidden, n_hidden), nn.ReLU())
                for _ in range(n_layers - 1)
            ],
            nn.Linear(n_hidden, n_classes),
        )

    def forward(self, images): ...


model = MNIST(**config.ModelConfig)
print(model)
GenericConfig is a special class whose attributes come solely from the config file, i.e., for when we are unsure what the arguments in the config are going to be.
A decorator that checks if any keyword argument is None. Raises a ValueError if any argument is None.

Args:
- func: The function to be decorated.

Returns: The decorated function.

Raises:
- ValueError: If any keyword argument is None.
io
io (func)

A decorator that inspects the inputs and outputs of a function.

Args:
- func: The function to be decorated.

Returns: The decorated function.
timeit
timeit (func)

A decorator that measures the execution time of a function.

Args:
- func (callable): The function to be timed.

Returns: callable: The wrapped function.

Example:

@timeit
def my_function():
    # code to be timed
    pass

my_function()  # prints the execution time of my_function
warn_on_fail

warn_on_fail (func)


format

format (input)
@timeit
@io
def foo(a, b):
    """
    This function takes two arguments, `a` and `b`, and returns their sum.

    Parameters:
        a (int): The first number.
        b (int): The second number.

    Returns:
        int: The sum of `a` and `b`.
    """
    import time

    time.sleep(1)
    return a + b


foo(10, 11)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
INPUTS: ARGS:
tuple of 2 items
    int: 10
    int: 11
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OUTPUTS:
int: 21
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[08/24/24 12:50:36] INFO foo took 1.00 seconds to execute    wrapper:47

21
@check_kwargs_not_none
@io
def foo(*, a=None, b=None):
    return a + b


foo(a=None, b=10)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[34], line 6
      1 @check_kwargs_not_none
      2 @io
      3 def foo(*, a=None, b=None):
      4     return a + b
----> 6 foo(a=None, b=10)

Cell In[32], line 32, in check_kwargs_not_none.<locals>.wrapper(*args, **kwargs)
     30 for key, value in kwargs.items():
     31     if value is None:
---> 32         raise ValueError(f"Input argument '{key}' cannot be None")
     33 return func(*args, **kwargs)

ValueError: Input argument 'a' cannot be None
All functions work with data frames that contain either absolute or relative coordinates, and preserve the image type (np.ndarray or PIL.Image.Image) too.
CPU times: user 263 ms, sys: 53.9 ms, total: 317 ms
Wall time: 407 ms
Below we extract the __all__ list from every Python file in the torch_snippets directory. Through this code, you can already see some elements of torch-snippets in action.
import ast

# os is already imported by torch_snippets, along with many other useful libraries
os.environ["AD_MAX_ITEMS"] = "1000"  # set the maximum number of items to display in the AD object


@tryy  # this is a decorator that catches exceptions
def extract_all_list(file_path):
    file = readfile(file_path, silent=True)  # read the file
    tree = ast.parse(file, filename=file_path)

    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__all__":
                    if isinstance(node.value, ast.List):
                        all_list = [
                            elt.value
                            for elt in node.value.elts
                            if isinstance(elt, ast.Constant)
                        ]
                        return all_list
    return None


def print_all_lists_in_directory(directory):
    dir = P(directory)  # create a pathlib.Path object
    for f in dir.ls():  # iterate over all files in the directory
        if f.extn == "py" and f.stem not in [
            "__init__",
            "_nbdev",
        ]:  # if it's a Python file and not __init__.py
            all_list = extract_all_list(f)
            if all_list is not None and len(all_list) > 0:
                h2(f.stem)  # print the name of the file as a heading in Jupyter
                print(
                    AD({"items": all_list})
                )  # AD is an intelligent dictionary that can display itself nicely
print(P().resolve())

/Users/yeshwanth/Code/Personal/torch_snippets/nbs
# Specify the directory containing the Python files
directory_path = "../torch_snippets"
print_all_lists_in_directory(directory_path)
def do2():
    Info('Log will still say it is from `do2` now, to the right of log print')

def do():
    do2()

do()


[08/24/24 12:17:13] INFO Log will still say it is from `do2` now, to the right of log print    2099333876.py:do2:2
def do2():
    Info('But now, log will still say it is from do', depth=1)

def do():
    do2()

do()


[08/24/24 12:17:25] INFO But now, log will still say it is from do    4054019042.py:do:5
Logging Level and context
in_logger_mode

in_logger_mode (level: str)

Returns True/False, checking whether the logger is in a specific mode.
logger_mode

logger_mode (level)

Temporarily set the logger level to something else, using a with context.
get_logger_level

get_logger_level ()

Get the current logger's level.
Let's log at every level in the do function below. We can control what gets logged from outside the function's context by using `with <level>_mode():`.
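A quick sketch using the three helpers documented above, assuming they are exported at the top level:

from torch_snippets import logger_mode, in_logger_mode, get_logger_level

print(get_logger_level())           # the current level, e.g. INFO
with logger_mode("DEBUG"):          # temporarily switch the level
    print(in_logger_mode("DEBUG"))  # True only inside this context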
The in_<level>_mode helpers give an additional layer of control, to be used for dynamic debugging. Let's say you want to show an image (for the sake of debugging). But once you are happy with your code and don't want the show (say, the code is going to production), a common way out is to just comment that line:
def do(im_path):
    from torch_snippets import show, read
    im = read(im_path)
    # show(im, sz=3)  # commented out, but must be uncommented again any time it needs debugging
    print(im.mean())

do('assets/Preamble.png')


145.5542982442515
But if you ever want to re-check, it's a pain to uncomment again, and this method doesn't scale to hundreds of lines of code. The simple way to deal with such transient code, which should activate only when you want it to, is to enclose it in an `if in_<level>_mode` conditional, like so:
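A minimal sketch of that pattern, assuming debug_mode / in_debug_mode are the DEBUG-level variants of logger_mode / in_logger_mode documented above:

from torch_snippets import read, show, in_debug_mode, debug_mode

def do(im_path):
    im = read(im_path)
    if in_debug_mode():  # only True inside `with debug_mode():`
        show(im, sz=3)   # no more commenting and uncommenting
    print(im.mean())

do('assets/Preamble.png')      # production: prints just the mean
with debug_mode():
    do('assets/Preamble.png')  # debugging: shows the image too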
Utility class to interact with a dictionary as if it were an object. AD is an alias to this class.

FEATURES:
0. Access and modify keys (including nested keys) as if they were object attributes, with tab-completion. Example: self.key1.key2[0].key3
1. Keys and values are recursively converted to AttrDict instances.
2. Pretty-print the dictionary using print.
3. Convert the entire structure to a regular dictionary at any time using self.to_dict() / self.dict().
4. Recursively remove keys using self.drop(key) from a JSON object.
5. Apply a function to all values at all levels using map.

GOTCHAS:
1. All integer keys are implicitly converted to strings due to the enforced self.key format.
2. You can still use self[int], but this internally converts the integer to a string.

METHODS:
- items(): Return the items of the AttrDict as key-value pairs.
- keys(): Return the keys of the AttrDict.
- values(): Return the values of the AttrDict.
- update(dict): Update the AttrDict with key-value pairs from another dictionary.
- get(key, default=None): Get the value associated with a key, with an optional default value.
- __iter__(): Allow iteration over the keys of the AttrDict.
- __len__(): Return the number of keys in the AttrDict.
- __repr__(): Return a string representation of the AttrDict.
- __dir__(): List the keys of the AttrDict as attributes.
- __contains__(key): Check if a key exists in the AttrDict; use "a.b.c" notation to directly check for a nested attribute.
- __delitem__(key): Delete a key from the AttrDict.
- map(func): Apply a function to all values in the AttrDict.
- drop(key): Recursively remove a key and its values from the AttrDict.
- to_dict(): Convert the AttrDict and its nested structure to a regular dictionary.
- pretty(print_with_logger=False, *args, **kwargs): Pretty-print the AttrDict as JSON.
- __eq__(other): Compare the AttrDict with another dictionary for equality.
- find_address(key, current_path=""): Find and return all addresses (paths) of a given key in the AttrDict.
- summary(current_path='', summary_str='', depth=0, sep=' '): Generate a summary of the structure and values in the AttrDict.
- write_summary(to, **kwargs): Write the summary to a file or stream.
- fetch(addr): Retrieve a value at a specified address (path).

PARAMETERS:
- data (dict, optional): Initial data to populate the AttrDict.

USAGE:
Create an AttrDict instance by providing an optional initial dictionary, then access and manipulate its contents as if they were object attributes.
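A quick sketch of AD in action, using only the methods listed above:

from torch_snippets import AD

x = AD({"a": {"b": [10, {"c": 20}]}})
print(x.a.b[1].c)            # 20: nested attribute-style access
print("a.b" in x)            # True: dotted membership check
print(x.find_address("c"))   # every path where the key "c" occurs
print(x.to_dict())           # back to a plain dict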
Use Timer as a standalone class so you have full control over when to call a lap (most useful in while loops)…
import time
from torch_snippets import Timer

N = 50
t = Timer(N)
info = None

for i in range(N):
    time.sleep(0.1)
    t(info=info)  # lap and present the time
    if i == N // 2:
        print()
        info = f"My Info: {i*3.122}"
N = 50
info = None

for i in (tracker := track2(range(N), total=N)):
    time.sleep(0.1)
    info = f"My Info: {i*3.122:.2f}"
    if i == N // 2:
        print()
    if i >= N // 2:
        tracker.send(info)
Warning! NEVER RUN tracker.send(None), as this will skip variables silently.
# https://youtu.be/XcU-a-eksMA
from IPython.display import YouTubeVideo

YouTubeVideo('XcU-a-eksMA', width=560, height=315)
Print execution time

@timeit decorates a function and prints the time it took to execute as an info log.
import time

@timeit
def foo(a, b=None):
    if b is None:
        return a + 1
    else:
        time.sleep(2)
        return a + b
There's only one use case where you would want to pass in a list yourself: when you want to append your errors to an existing list. The sensible default is to always store the errors, especially because this is a debugging tool.
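A hypothetical sketch of that use case; the keyword name error_store is an assumption, chosen to match the attribute used below:

shared_errors = []

@tryy(error_store=shared_errors)  # hypothetical keyword: pass your own list
def do(a, b, c):
    return 1 / 0

do(1, 2, 3)
print(len(shared_errors))  # the failure above is appended to your list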
Finally, you want to run the function (without the tryy wrapper) to reproduce the error and actually start debugging. Just use the .F attribute to access the original function that you created:
ix = 2
data = do.error_store[ix]
do.F(*data.args, **data.kwargs)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[19], line 3
      1 ix = 2
      2 data = do.error_store[ix]
----> 3 do.F(*data.args, **data.kwargs)

Cell In[17], line 9, in do(a, b, c)
      6 @tryy(silence_errors=True)
      7 def do(a, b, c):
      8     if c < 50:
----> 9         return 1/0
     10     else:
     11         raise NotImplementedError("🤔")

ZeroDivisionError: division by zero
from functools import wraps

def deco(decorator):
    @wraps(decorator)
    def wrapper(*args, **kwargs):
        def real_decorator(func):
            @wraps(func)
            def inner_wrapper(*fargs, **fkwargs):
                return decorator(func, *fargs, **fkwargs)

            return inner_wrapper

        if len(args) == 1 and callable(args[0]) and not kwargs:
            # Case when the decorator is used without arguments
            return real_decorator(args[0])
        else:
            # Case when the decorator is used with arguments
            def custom_decorator(func):
                return decorator(func, **kwargs)

            return custom_decorator

    return wrapper
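A usage sketch exercising the bare-decorator path; logged is a hypothetical three-argument decorator body:

@deco
def logged(func, *args, **kwargs):
    print(f"calling {func.__name__}")
    return func(*args, **kwargs)

@logged
def add(a, b):
    return a + b

print(add(2, 3))  # prints "calling add", then 5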
100%|██████████| 2/2 [00:00<00:00, 649.78it/s]
/var/folders/1_/71dqv9vx2750gmyz77q_f45w0000gn/T/ipykernel_40147/3826617683.py:27: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
  f.extractall(dest)
Notice how the ${} variables got resolved. Not only that, the variable z was computed on the fly, and variables like list and dict were resolved into their respective Python data structures.
from torch_snippets import *

im = np.random.rand(100, 100)
show(im)
show(im, sz=4)
show will even accept PyTorch tensors and display them as images, even if they are on the GPU and channels-first.

It can accept bounding boxes as tuples of (x, y, X, Y), which can be integers (i.e., absolute coordinates) or fractions (in \([0,1]\)). There's provision to give bb_colors and texts as well.
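A hedged sketch of the bounding-box overlay; the keyword names bbs, bb_colors, and texts follow the prose above but should be treated as assumptions:

show(
    im,
    bbs=[(10, 10, 60, 60), (0.1, 0.5, 0.4, 0.9)],  # one absolute, one fractional box
    bb_colors=["red", "green"],
    texts=["abs", "frac"],
    sz=4,
)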
You can find:

1. train_test_split, which also resets the dataframes' indexes
2. MakeFrame
3. ImputeMissingValues
4. Cat2Num
5. Other scikit-lego blocks that I use a lot
MakeFrame

MakeFrame (column_names)

Convert sklearn's output to a pandas dataframe. Especially useful when working with an ensemble of models.

Usage

Call MakeFrame as the last component in your pipeline with the desired column names.
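A minimal sketch, assuming a standard scikit-learn pipeline; the column names are illustrative:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

pipe = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    MakeFrame(column_names=["pc1", "pc2"]),  # last step: wrap the output in a DataFrame
)
df_out = pipe.fit_transform(X)  # X from the iris example above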
ImputeMissingValues (num_mode=<function mean at 0x10695b6f0>, cat_mode='MISSING')

DataFrame input - DataFrame output.
During fit:
1. Store the imputable value for each column.
During transform:
2. Impute missing values with the imputable value.
3. Create a '{col}_na' boolean column to tell whether cells contained a missing value.
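A small sketch on a toy frame; the column names are hypothetical:

import pandas as pd

df = pd.DataFrame({"age": [20.0, None, 40.0], "city": ["x", None, "y"]})
imp = ImputeMissingValues()  # num_mode=mean, cat_mode='MISSING' by default
out = imp.fit_transform(df)
print(out)  # per the doc above, adds `age_na` / `city_na` boolean columns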
LambdaTransformer

LambdaTransformer (fn)

Base class for all estimators in scikit-learn.

Inheriting from this class provides default implementations of:

- setting and getting parameters used by GridSearchCV and friends;
- textual and HTML representation displayed in terminals and IDEs;
- estimator serialization;
- parameters validation;
- data validation;
- feature names validation.

Read more in the :ref:`User Guide <rolling_your_own_estimator>`.
Cat2Num

Cat2Num ()