# IRIS Open Source Datasets

## Top Level Directory Naming

The general format for directory names is `--`

Within `variant` the naming convention is still in flux. At the moment, the following variants exist:

* `-` - where X is the rotation in degrees of the light source, and Y is the version of the raw data set
* `gds` - contains various views of the design source for the chip in GDS format

Note that the version numbering is a simple incrementing counter. The system is not yet stable enough to have a clear notion of what each version change entails, but examples of things that are changing version-to-version include:

* Autofocus algorithm
* Camera type
* Resolution of tile
* Stage levelling algorithm
* Naming convention of files

## Sub Directory Contents

Within each non-`gds` directory, you will find the following:

* A stitched image saved without blending (filename contains `fast`)
* A stitched image with blending
* A JSON file with the stitching trace at the point where the images were saved
* A `raw` directory that contains the raw source images that feed stitching, along with various intermediate stitching database artifacts. (**this is the input for the [stitcher](https://github.com/bunnie/iris-stitcher) script**)
* A `subdesigns` directory that contains manually extracted regions of interest that correspond to GDS sub-designs (**this is the starting point for the [automated cell extraction](https://github.com/bunnie/iris-layout/blob/main/gds_to_png.py) routines**)

There may also be a `.psb` file which is a Photoshop format file that does a manually aligned sanity check of the GDS-vs-layout. It's primarily for debugging purposes and highlights where the released GDS files might deviate from the actual fabricated design data.

### Raw Image Filename Format

Each raw image name consists of an underscore-separated list of attributes.
The ground truth on the attributes can be found in the imaging software, namely [in this class definition](https://github.com/bunnie/jubiris/blob/211272c0155bf015af99b365551f2397b8f62211/midi-ctrl/miduet.py#L142-L163) and [in this output formatter](https://github.com/bunnie/jubiris/blob/211272c0155bf015af99b365551f2397b8f62211/midi-ctrl/miduet.py#L207-L212). Note that GitHub permalinks are used for these references; check the latest `main` branch for any updates to the naming convention.

## Example Extraction

`extracted-example` contains artifacts that are generated by `gds_to_png.py`, along with some Python-pickled datasets that can be read in and fed into PyTorch. These are provided to help sanity check initial runs of the script. Fully extracted datasets can also be provided, but the script is still in development, so it's recommended to run it from source for now.

To help with sanity checking/debugging, `iris-layout-export.tgz` contains a snapshot of a working directory with all the artifacts in place.

## Overview Flow

See [the imaging flow diagram](https://bunnie.org/iris/datasets/imaging-flow.png) for a graphical representation of all the assets in this directory, how they are derived, and how they are related to each other.

![imaging flow diagram](https://bunnie.org/iris/datasets/imaging-flow.png)
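## Example: Checking a Downloaded Directory

As a rough aid for checking a downloaded copy against the layout described under "Sub Directory Contents", the sketch below groups the entries of one non-`gds` dataset directory by role. It is not part of any IRIS tooling: the function name `inventory` and the dictionary keys are made up for illustration, and it relies only on what this README states (a `fast` substring for the unblended stitch, a JSON trace, an optional `.psb` file, and the `raw` and `subdesigns` directories). Because the image file extension is not specified here, the blended stitched image simply ends up under `other_files`.

```python
from pathlib import Path


def inventory(dataset_dir):
    """Group the entries of one non-gds dataset directory by role.

    Hypothetical helper: the keys below are illustrative names, not an
    established IRIS convention.
    """
    root = Path(dataset_dir)
    found = {
        "stitched_fast": [],     # stitched without blending: name contains "fast"
        "json_trace": [],        # JSON file(s) holding the stitching trace
        "psb": [],               # optional Photoshop sanity-check file(s)
        "raw_dir": None,         # input for the stitcher
        "subdesigns_dir": None,  # starting point for cell extraction
        "other_files": [],       # everything else, e.g. the blended stitch
    }
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            if entry.name == "raw":
                found["raw_dir"] = entry
            elif entry.name == "subdesigns":
                found["subdesigns_dir"] = entry
            continue
        suffix = entry.suffix.lower()
        if suffix == ".json":
            found["json_trace"].append(entry)
        elif suffix == ".psb":
            found["psb"].append(entry)
        elif "fast" in entry.name:
            found["stitched_fast"].append(entry)
        else:
            found["other_files"].append(entry)
    return found
```

Calling `inventory("path/to/dataset")` and checking that `raw_dir` and `subdesigns_dir` are not `None` is a quick way to confirm that a download is complete before running the stitcher or `gds_to_png.py`.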