[1]:
!pip install parquetdb
!pip install ipykernel

01 - Managing Graphs in ParquetGraphDB

In this notebook, we’ll learn how to:

  1. Add new nodes and node types.

  2. Add new edges and edge types.

  3. Create node generators that automatically produce nodes based on a predefined function.

  4. Create edge generators that automatically produce edges based on a predefined function.

We’ll use the ParquetGraphDB class from parquetdb to demonstrate these features. If you haven’t already installed parquetdb, run the previous cell.

Setup

Here we will set up the directory structure that will hold our test data.

[1]:
from pathlib import Path
import shutil
import pandas as pd


FILE_DIR = Path(".")
DATA_DIR = FILE_DIR / "data"

if DATA_DIR.exists():
    shutil.rmtree(DATA_DIR)

DATA_DIR.mkdir(parents=True, exist_ok=True)

Initializing ParquetGraphDB

Here we will initialize the ParquetGraphDB instance. We can print a summary of the database by printing the instance directly or by calling the summary method. The summary method accepts additional keyword arguments, such as show_column_names, that display extra information like the column names (fields) of each store.

[2]:
from parquetdb import ParquetGraphDB

# Create a temporary directory for our database
GRAPH_DB_DIR = DATA_DIR / "GraphDB"
if GRAPH_DB_DIR.exists():
    shutil.rmtree(GRAPH_DB_DIR)
GRAPH_DB_DIR.mkdir(parents=True, exist_ok=True)


# Initialize ParquetGraphDB
db = ParquetGraphDB(storage_path=GRAPH_DB_DIR)

print(db)
[INFO] 2025-04-30 08:10:23 - parquetdb.utils.config[37][load_config] - Config file: C:\Users\lllang\AppData\Local\parquetdb\parquetdb\config.yml
[INFO] 2025-04-30 08:10:23 - parquetdb.utils.config[41][load_config] - Setting data_dir to Z:\data\parquetdb\data
[INFO] 2025-04-30 08:10:24 - parquetdb.graph.parquet_graphdb[39][__init__] - Initializing GraphDB at root path: GraphDB
[INFO] 2025-04-30 08:10:24 - parquetdb.graph.parquet_graphdb[174][_load_existing_node_stores] - Loading existing node stores
[INFO] 2025-04-30 08:10:24 - parquetdb.graph.parquet_graphdb[203][_load_existing_stores] - Found 0 store types
[INFO] 2025-04-30 08:10:24 - parquetdb.graph.parquet_graphdb[182][_load_existing_edge_stores] - Loading existing edge stores
[INFO] 2025-04-30 08:10:24 - parquetdb.graph.parquet_graphdb[203][_load_existing_stores] - Found 0 store types
[INFO] 2025-04-30 08:10:24 - parquetdb.core.parquetdb[200][__init__] - Initializing ParquetDB with db_path: c:\Users\lllang\Desktop\Current_Projects\ParquetDB\examples\graphdb\GraphDB\edge_generators
[INFO] 2025-04-30 08:10:24 - parquetdb.core.parquetdb[202][__init__] - verbose: 1
============================================================
GRAPH DATABASE SUMMARY
============================================================
Name: GraphDB
Storage path: GraphDB
└── Repository structure:
    ├── nodes/                 (GraphDB\nodes)
    ├── edges/                 (GraphDB\edges)
    ├── edge_generators/       (GraphDB\edge_generators)
    ├── node_generators/       (GraphDB\node_generators)
    └── graph/                 (GraphDB\graph)

############################################################
NODE DETAILS
############################################################
Total node types: 0
------------------------------------------------------------

############################################################
EDGE DETAILS
############################################################
Total edge types: 0
------------------------------------------------------------

############################################################
NODE GENERATOR DETAILS
############################################################
Total node generators: 0
------------------------------------------------------------

############################################################
EDGE GENERATOR DETAILS
############################################################
Total edge generators: 0
------------------------------------------------------------

Each ParquetGraphDB is a directory containing the following:

  • nodes: A directory containing the node data. Nodes are stored here as NodeStores, which extend the ParquetDB class.

  • edges: A directory containing the edge data. Edges are stored here as EdgeStores, which extend the ParquetDB class.

  • node_generators: A directory containing the node generator data. This is where we store node generator functions, which control the creation and update of child nodes that depend on parent nodes or edges.

  • edge_generators: A directory containing the edge generator data. This is where we store edge generator functions, which control the creation and update of child edges that depend on parent nodes or edges.

  • generator_dependency.json: A json file containing the dependency graph of the generators. This is used to determine the order of execution of the generators.

As you can see, it is currently empty. Let’s add some nodes and edges to the database.
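As a side note on the layout described above: because a ParquetGraphDB is an ordinary directory, its structure can be inspected with the standard library. The snippet below is a minimal sketch and not part of parquetdb. `list_subdirs` is a hypothetical helper, and the demo builds a stand-in directory tree that only mimics the names shown in the summary rather than creating a real database.

```python
from pathlib import Path
import tempfile


def list_subdirs(root):
    """Return the sorted names of the directories directly under root."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


# Stand-in layout mimicking the printed summary (assumption: not a real database).
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "GraphDB"
    for name in ["nodes", "edges", "node_generators", "edge_generators", "graph"]:
        (demo_root / name).mkdir(parents=True)
    layout = list_subdirs(demo_root)

print(layout)
# ['edge_generators', 'edges', 'graph', 'node_generators', 'nodes']
```

Pointing the same helper at GRAPH_DB_DIR should list whatever subdirectories the database actually created on disk.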

1. New Nodes

By default, the database contains no node types. You can initialize a new node type in three ways:

  1. You can add your own node types via add_node_type(...), which creates an empty NodeStore for that type.

  2. You can add nodes directly by using the add_nodes(node_type, data) method and supply the node_type and data.

  3. You can add NodeStore instances directly by using the add_node_store(node_store) method.

Once the node type is initialized, the main method to add nodes is through the add_nodes(node_type, data) method.

Let’s initialize a new node type.

[3]:
# Add a node type called 'users'
custom_node_type = "users"

db.add_node_type(custom_node_type)

# These nodes will be stored in GraphDB/nodes/users
print("Current node_stores:", list(db.node_stores.keys()))

print(db.summary(show_column_names=True))
Current node_stores: ['users']
============================================================
GRAPH DATABASE SUMMARY
============================================================
Name: GraphDB
Storage path: GraphDB
└── Repository structure:
    ├── nodes/                 (GraphDB\nodes)
    ├── edges/                 (GraphDB\edges)
    ├── edge_generators/       (GraphDB\edge_generators)
    ├── node_generators/       (GraphDB\node_generators)
    └── graph/                 (GraphDB\graph)

############################################################
NODE DETAILS
############################################################
Total node types: 1
------------------------------------------------------------
• Node type: users
  - Number of nodes: 0
  - Number of features: 1
  - Columns:
       - id
  - db_path: GraphDB\nodes\users
------------------------------------------------------------

############################################################
EDGE DETAILS
############################################################
Total edge types: 0
------------------------------------------------------------

############################################################
NODE GENERATOR DETAILS
############################################################
Total node generators: 0
------------------------------------------------------------

############################################################
EDGE GENERATOR DETAILS
############################################################
Total edge generators: 0
------------------------------------------------------------

The node store instances are stored in the node_stores attribute, which is a dictionary of node_type to NodeStore instances.

When we print the summary now, the new node type is included. Here we also print the column names of the node store. The only column so far is id, the unique local identifier of a node. Every new node is assigned an id automatically.

Our database is currently empty, so let’s add some nodes to it.

Adding Nodes

As mentioned above, once a node type is registered, you can add nodes to it using the add_nodes(node_type, data) method. The data argument can take the following forms:

  1. list of dictionaries (Each dictionary represents a node)

  2. dictionary of arrays (Each key is a column name and each value is an array holding that column’s values, one entry per node)

  3. pandas.DataFrame (Each row is a node)

  4. pyarrow.Table (Each row is a node)

Note: calling add_nodes with a node type that does not exist yet registers that node type automatically.
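The first two input forms carry the same information in a different orientation. The sketch below converts between them in plain Python to make that correspondence concrete. `rows_to_columns` and `columns_to_rows` are hypothetical helpers written for this illustration. They are not parquetdb functions and say nothing about how the library converts its inputs internally.

```python
def rows_to_columns(rows):
    """Convert a list of row dictionaries into a dictionary of column lists."""
    column_names = []
    for row in rows:
        for key in row:
            if key not in column_names:
                column_names.append(key)
    # Missing values become None so every column has one entry per row.
    return {name: [row.get(name) for row in rows] for name in column_names}


def columns_to_rows(columns):
    """Convert a dictionary of equal-length column lists into row dictionaries."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    n_rows = lengths.pop() if lengths else 0
    return [
        {name: values[i] for name, values in columns.items()}
        for i in range(n_rows)
    ]


user_rows = [{"name": "Jimmy"}, {"name": "John"}]
user_columns = rows_to_columns(user_rows)

print(user_columns)                                 # {'name': ['Jimmy', 'John']}
print(columns_to_rows(user_columns) == user_rows)   # True
```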

Here, we’ll add data to the existing node type and add a new node type at the same time.

[4]:
# Add some user nodes
users = [
    {"name": "Jimmy"},
    {"name": "John"},
]

computers = [
    {
        "name": "Computer1",
        "specs": {"cpu": "AMD Ryzen 9", "ram": "32GB", "storage": "1TB"},
    },
    {
        "name": "Computer2",
        "specs": {"cpu": "Intel i7", "ram": "16GB", "storage": "512GB"},
    },
]

users_node_type = "users"
computers_node_type = "computers"

db.add_nodes(node_type=users_node_type, data=users)
db.add_nodes(node_type=computers_node_type, data=computers)

print(db.summary(show_column_names=True))
============================================================
GRAPH DATABASE SUMMARY
============================================================
Name: GraphDB
Storage path: GraphDB
└── Repository structure:
    ├── nodes/                 (GraphDB\nodes)
    ├── edges/                 (GraphDB\edges)
    ├── edge_generators/       (GraphDB\edge_generators)
    ├── node_generators/       (GraphDB\node_generators)
    └── graph/                 (GraphDB\graph)

############################################################
NODE DETAILS
############################################################
Total node types: 2
------------------------------------------------------------
• Node type: users
  - Number of nodes: 2
  - Number of features: 2
  - Columns:
       - id
       - name
  - db_path: GraphDB\nodes\users
------------------------------------------------------------
• Node type: computers
  - Number of nodes: 2
  - Number of features: 5
  - Columns:
       - id
       - name
       - specs.cpu
       - specs.ram
       - specs.storage
  - db_path: GraphDB\nodes\computers
------------------------------------------------------------

############################################################
EDGE DETAILS
############################################################
Total edge types: 0
------------------------------------------------------------

############################################################
NODE GENERATOR DETAILS
############################################################
Total node generators: 0
------------------------------------------------------------

############################################################
EDGE GENERATOR DETAILS
############################################################
Total edge generators: 0
------------------------------------------------------------

Great! We now have two node types, users and computers, each with two nodes. The summary lists both node types along with their node counts, columns, and storage paths. Note that the nested specs dictionary appears as the flat columns specs.cpu, specs.ram, and specs.storage.
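The dotted column names in the summary suggest that nested dictionaries are flattened into one column per leaf field. The sketch below reproduces that naming in plain Python as a mental model only. `flatten_record` is a hypothetical helper, not parquetdb's implementation.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the dotted prefix.
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


computer = {
    "name": "Computer1",
    "specs": {"cpu": "AMD Ryzen 9", "ram": "32GB", "storage": "1TB"},
}
flat_computer = flatten_record(computer)

print(list(flat_computer))
# ['name', 'specs.cpu', 'specs.ram', 'specs.storage']
```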

Now that we have added some nodes, let’s look at how to manage them.

Managing the node store

Once the data is registered, you can access it through the corresponding node store. You can get the node store either through the node_stores attribute or the get_node_store(node_type) method.

[5]:
computers_node_store = db.get_node_store(computers_node_type)
print(type(computers_node_store))
print(computers_node_store)


users_node_store = db.node_stores[users_node_type]
print(type(users_node_store))

print(users_node_store)
<class 'parquetdb.graph.nodes.NodeStore'>
============================================================
NODE STORE SUMMARY
============================================================
Node type: computers
• Number of nodes: 2
• Number of features: 5
Storage path: GraphDB\nodes\computers


############################################################
METADATA
############################################################
• class: NodeStore
• class_module: parquetdb.graph.nodes
• node_type: computers
• name_column: id

############################################################
NODE DETAILS
############################################################

<class 'parquetdb.graph.nodes.NodeStore'>
============================================================
NODE STORE SUMMARY
============================================================
Node type: users
• Number of nodes: 2
• Number of features: 2
Storage path: GraphDB\nodes\users


############################################################
METADATA
############################################################
• class: NodeStore
• class_module: parquetdb.graph.nodes
• node_type: users
• name_column: id

############################################################
NODE DETAILS
############################################################

Reading from the node store

There are multiple ways to read from the node store. You can use the read_nodes method of the ParquetGraphDB instance, the read_nodes method of the NodeStore instance, or the read method of the NodeStore instance. These methods behave much like the read features introduced in the previous notebook: for example, you can select a subset of columns and restrict the rows with filters.

[9]:
import pyarrow.compute as pc

df = db.read_nodes(node_type=users_node_type).to_pandas()
print(df)

print("-"*100)

df = computers_node_store.read().to_pandas()

print(df)

print("-"*100)

# We can filter this similar to ParquetDB
df = computers_node_store.read(
    filters=[pc.field("specs.cpu") == "Intel i7"]
).to_pandas()

print(df)
print("-"*100)


# With rebuild_nested_struct=True, nested fields are addressed as
# pc.field("specs", "cpu") instead of the flat name "specs.cpu"
df = computers_node_store.read_nodes(
    columns=["name", "id", "specs"],
    filters=[pc.field("specs", "cpu") == "AMD Ryzen 9"],
    rebuild_nested_struct=True,
).to_pandas()
print(df)



   id   name
0   0  Jimmy
1   1   John
----------------------------------------------------------------------------------------------------
   id       name    specs.cpu specs.ram specs.storage
0   0  Computer1  AMD Ryzen 9      32GB           1TB
1   1  Computer2     Intel i7      16GB         512GB
----------------------------------------------------------------------------------------------------
   id       name specs.cpu specs.ram specs.storage
0   1  Computer2  Intel i7      16GB         512GB
----------------------------------------------------------------------------------------------------
        name  id                                              specs
0  Computer1   0  {'cpu': 'AMD Ryzen 9', 'ram': '32GB', 'storage...
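The last two reads differ in how the specs fields are addressed: as flat dotted columns by default, and as a nested struct when rebuild_nested_struct=True. The sketch below shows the same difference on a single plain-Python row. `unflatten_record` is a hypothetical helper for illustration and does not reflect how parquetdb rebuilds structs.

```python
def unflatten_record(flat, sep="."):
    """Rebuild nested dictionaries from a flat dictionary with dotted keys."""
    nested = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(sep)
        target = nested
        # Walk (and create) the intermediate dictionaries for each prefix part.
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return nested


flat_row = {
    "id": 0,
    "name": "Computer1",
    "specs.cpu": "AMD Ryzen 9",
    "specs.ram": "32GB",
    "specs.storage": "1TB",
}
nested_row = unflatten_record(flat_row)

print(flat_row["specs.cpu"])        # flat access:   AMD Ryzen 9
print(nested_row["specs"]["cpu"])   # nested access: AMD Ryzen 9
```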

Updating the node store

You can update the node store by using the update_nodes method from the ParquetGraphDB instance, or the update_nodes method from the NodeStore instance.

[10]:
computer_update_data = [
    {"name": "Computer1", "specs": {"ram": "128GB", "storage": "1TB"}},
    {"name": "Computer2", "specs": {"ram": "256GB", "storage": "2TB"}},
]

db.update_nodes(
    node_type=computers_node_type, data=computer_update_data, update_keys=["name"]
)

df = db.read_nodes(node_type=computers_node_type).to_pandas()
print(df)
   id       name    specs.cpu specs.ram specs.storage
0   0  Computer1  AMD Ryzen 9     128GB           1TB
1   1  Computer2     Intel i7     256GB           2TB
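In the output above, only the fields present in the update changed, while specs.cpu kept its value. A keyed update of this kind can be modelled in plain Python as follows. This is a conceptual sketch only: `update_rows` is a hypothetical helper, the update records are written in already-flattened form, and skipping unmatched updates is a choice made for this sketch rather than documented parquetdb behaviour.

```python
def update_rows(rows, updates, update_keys):
    """Overwrite matching rows in place with the fields given in each update."""
    index = {tuple(row[k] for k in update_keys): row for row in rows}
    for update in updates:
        key = tuple(update[k] for k in update_keys)
        row = index.get(key)
        if row is None:
            continue  # sketch-only choice: ignore updates that match no row
        row.update(update)  # columns absent from the update are left untouched
    return rows


computer_rows = [
    {"id": 0, "name": "Computer1", "specs.cpu": "AMD Ryzen 9",
     "specs.ram": "32GB", "specs.storage": "1TB"},
    {"id": 1, "name": "Computer2", "specs.cpu": "Intel i7",
     "specs.ram": "16GB", "specs.storage": "512GB"},
]
flat_updates = [
    {"name": "Computer1", "specs.ram": "128GB", "specs.storage": "1TB"},
    {"name": "Computer2", "specs.ram": "256GB", "specs.storage": "2TB"},
]
update_rows(computer_rows, flat_updates, update_keys=["name"])

print(computer_rows[0]["specs.cpu"], computer_rows[0]["specs.ram"])
# AMD Ryzen 9 128GB
```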

2. Adding New Edges

Edges are managed in the same way as nodes, but they are stored in the EdgeStore instance. EdgeStores differ from NodeStores as they have to store the source and target node ids, as well as the edge type. These must be specified to add an edge.

You can create a new edge type using add_edge_type(edge_type). Then, you can add edges by calling add_edges(edge_type, data). Each edge record must specify:

  • source_id and source_type

  • target_id and target_type

The ids and types must match the node types and node ids that exist in the ParquetGraphDB.
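Before adding edges, it can help to check these requirements on the raw records. The sketch below does this in plain Python. `REQUIRED_EDGE_FIELDS` and `find_invalid_edges` are hypothetical names introduced for this illustration, and the notebook does not show whether parquetdb performs an equivalent check itself.

```python
REQUIRED_EDGE_FIELDS = ("source_id", "source_type", "target_id", "target_type")


def find_invalid_edges(edge_rows, node_ids_by_type):
    """Return (index, reason) pairs for edges that are incomplete or dangling."""
    problems = []
    for i, edge in enumerate(edge_rows):
        missing = [field for field in REQUIRED_EDGE_FIELDS if field not in edge]
        if missing:
            problems.append((i, f"missing fields: {missing}"))
            continue
        for side in ("source", "target"):
            node_type = edge[f"{side}_type"]
            node_id = edge[f"{side}_id"]
            # An endpoint is valid only if that id exists for that node type.
            if node_id not in node_ids_by_type.get(node_type, set()):
                problems.append((i, f"unknown {side} node: {node_type}/{node_id}"))
    return problems


node_ids = {"users": {0, 1}, "computers": {0, 1}}
candidate_edges = [
    {"source_id": 0, "source_type": "users", "target_id": 1, "target_type": "computers"},
    {"source_id": 5, "source_type": "users", "target_id": 0, "target_type": "computers"},
    {"source_id": 0, "source_type": "users"},
]
problems = find_invalid_edges(candidate_edges, node_ids)

for index, reason in problems:
    print(index, reason)
```

With real data, the id sets could be built from the id column returned by read_nodes for each node type.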

[41]:
# Add edge type
edge_type_test = "user_access"

# We'll connect the 'users' nodes to the 'computers' nodes
edge_data = [
    {
        "source_id": 0,  # This is the id of the user node
        "source_type": users_node_type,
        "target_id": 0,  # This is the id of the computer node
        "target_type": computers_node_type,
        "edge_type": edge_type_test,
        "name": "Jimmy has access to Computer1",
    },
    {
        "source_id": 0,  # This is the id of the user node
        "source_type": users_node_type,
        "target_id": 1,  # This is the id of the computer node
        "target_type": computers_node_type,
        "edge_type": edge_type_test,
        "name": "Jimmy has access to Computer2",
    },
    {
        "source_id": 1,
        "source_type": users_node_type,
        "target_id": 1,
        "target_type": computers_node_type,
        "edge_type": edge_type_test,
        "name": "John has access to Computer2",
    },
    {
        "source_id": 0,
        "source_type": computers_node_type,
        "target_id": 1,
        "target_type": computers_node_type,
        "edge_type": edge_type_test,
        "name": "Computer1 has access to Computer2",
    },
    {
        "source_id": 1,
        "source_type": computers_node_type,
        "target_id": 0,
        "target_type": computers_node_type,
        "edge_type": edge_type_test,
        "name": "Computer2 has access to Computer1",
    },
    {
        "source_id": 0,
        "source_type": computers_node_type,
        "target_id": 0,
        "target_type": computers_node_type,
        "edge_type": edge_type_test,
        "name": "Computer1 has access to Computer1",
        "extra_detail": "This is the main computer",
    },
]

db.add_edges(edge_type=edge_type_test, data=edge_data)

edges = db.read_edges(edge_type=edge_type_test)
print(f"Number of edges of type '{edge_type_test}':", len(edges))
df_edges = edges.to_pandas()
print(df_edges)
Number of edges of type 'user_access': 6
     edge_type               extra_detail  id  \
0  user_access                       None   0
1  user_access                       None   1
2  user_access                       None   2
3  user_access                       None   3
4  user_access                       None   4
5  user_access  This is the main computer   5

                                name  source_id source_type  target_id  \
0      Jimmy has access to Computer1          0       users          0
1      Jimmy has access to Computer2          0       users          1
2       John has access to Computer2          1       users          1
3  Computer1 has access to Computer2          0   computers          1
4  Computer2 has access to Computer1          1   computers          0
5  Computer1 has access to Computer1          0   computers          0

  target_type
0   computers
1   computers
2   computers
3   computers
4   computers
5   computers

In this example we have defined the computer access edges between users and computers. Note that we can specify self-edges and directionality of the edges by choosing which node is the source and which is the target.

We are also free to add extra columns (features) to the edges, such as extra_detail in this case. Edges that do not supply a value for such a column get None.
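To make directionality and self-edges concrete, the sketch below builds a directed adjacency mapping from edge records shaped like the ones above. It is plain Python for illustration. `build_adjacency` and `find_self_edges` are hypothetical helpers, not parquetdb methods.

```python
from collections import defaultdict


def build_adjacency(edge_rows):
    """Map each (type, id) source node to the list of its (type, id) targets."""
    adjacency = defaultdict(list)
    for edge in edge_rows:
        source = (edge["source_type"], edge["source_id"])
        target = (edge["target_type"], edge["target_id"])
        adjacency[source].append(target)  # direction: source -> target
    return dict(adjacency)


def find_self_edges(edge_rows):
    """Return the edges whose source and target are the same node."""
    return [
        edge for edge in edge_rows
        if (edge["source_type"], edge["source_id"])
        == (edge["target_type"], edge["target_id"])
    ]


sample_edges = [
    {"source_type": "users", "source_id": 0, "target_type": "computers", "target_id": 0},
    {"source_type": "users", "source_id": 0, "target_type": "computers", "target_id": 1},
    {"source_type": "computers", "source_id": 0, "target_type": "computers", "target_id": 1},
    {"source_type": "computers", "source_id": 0, "target_type": "computers", "target_id": 0},
]
adjacency = build_adjacency(sample_edges)

print(adjacency[("users", 0)])             # [('computers', 0), ('computers', 1)]
print(len(find_self_edges(sample_edges)))  # 1
```

Note that a node is identified by its type together with its id: users/0 and computers/0 are different nodes, so only the last sample edge is a self-edge.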

Updating the edges

You can update the edges by using the update_edges method of the ParquetGraphDB instance, or the update_edges method of the EdgeStore instance.

[42]:
update_data = [
    {"id": 0, "weight": 1.0},
    {"id": 1, "weight": 1.0},
]

db.update_edges(edge_type=edge_type_test, data=update_data)

edges = db.read_edges(
    edge_type=edge_type_test, columns=["id", "source_id", "target_id", "weight", "name"]
).to_pandas()
print(f"Number of edges of type '{edge_type_test}':", len(edges))
print(edges)
Number of edges of type 'user_access': 6
   id  source_id  target_id  weight                               name
0   0          0          0     1.0      Jimmy has access to Computer1
1   1          0          1     1.0      Jimmy has access to Computer2
2   2          1          1     NaN       John has access to Computer2
3   3          0          1     NaN  Computer1 has access to Computer2
4   4          1          0     NaN  Computer2 has access to Computer1
5   5          0          0     NaN  Computer1 has access to Computer1

We can also update by specifying the source and target ids and types. To do this we need to specify source_id, target_id, source_type, and target_type in the update_keys argument.

[43]:
update_data = [
    {
        "source_id": 0,
        "source_type": users_node_type,
        "target_id": 0,
        "target_type": computers_node_type,
        "weight": 0.5,
    },
]

db.update_edges(
    edge_type=edge_type_test,
    data=update_data,
    update_keys=["source_id", "target_id", "source_type", "target_type"],
)


edges = db.read_edges(
    edge_type=edge_type_test, columns=["id", "source_id", "target_id", "weight", "name"]
).to_pandas()
print(f"Number of edges of type '{edge_type_test}':", len(edges))
print(edges)
Number of edges of type 'user_access': 6
   id  source_id  target_id  weight                               name
0   0          0          0     0.5      Jimmy has access to Computer1
1   1          0          1     1.0      Jimmy has access to Computer2
2   2          1          1     NaN       John has access to Computer2
3   3          0          1     NaN  Computer1 has access to Computer2
4   4          1          0     NaN  Computer2 has access to Computer1
5   5          0          0     NaN  Computer1 has access to Computer1
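The node types belong in update_keys because ids are only unique within a node type: edges 0 and 5 above both have source_id 0 and target_id 0 and differ only in their source type. The plain-Python sketch below models a composite-key update on two such edges. `update_edges_by_keys` is a hypothetical helper, and filling a newly introduced column with None is this sketch's stand-in for the NaN values shown in the output.

```python
def update_edges_by_keys(edge_rows, updates, update_keys):
    """Apply each update to every edge that matches it on all update_keys."""
    # Columns introduced by the updates exist on every edge, defaulting to None.
    new_columns = {key for update in updates for key in update if key not in update_keys}
    for edge in edge_rows:
        for column in new_columns:
            edge.setdefault(column, None)
    for update in updates:
        for edge in edge_rows:
            if all(edge[key] == update[key] for key in update_keys):
                edge.update(update)
    return edge_rows


toy_edges = [
    {"id": 0, "source_id": 0, "source_type": "users",
     "target_id": 0, "target_type": "computers"},
    {"id": 5, "source_id": 0, "source_type": "computers",
     "target_id": 0, "target_type": "computers"},
]
toy_update = [
    {"source_id": 0, "source_type": "users",
     "target_id": 0, "target_type": "computers", "weight": 0.5},
]
update_edges_by_keys(
    toy_edges,
    toy_update,
    update_keys=["source_id", "target_id", "source_type", "target_type"],
)

print([edge["weight"] for edge in toy_edges])  # [0.5, None]
```

Matching on source_id and target_id alone would have updated both edges in this sketch, which is why the types are part of the key.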
[44]:
print(db)
============================================================
GRAPH DATABASE SUMMARY
============================================================
Name: GraphDB
Storage path: GraphDB
└── Repository structure:
    ├── nodes/                 (GraphDB\nodes)
    ├── edges/                 (GraphDB\edges)
    ├── edge_generators/       (GraphDB\edge_generators)
    ├── node_generators/       (GraphDB\node_generators)
    └── graph/                 (GraphDB\graph)

############################################################
NODE DETAILS
############################################################
Total node types: 2
------------------------------------------------------------
• Node type: users
  - Number of nodes: 2
  - Number of features: 2
  - db_path: GraphDB\nodes\users
------------------------------------------------------------
• Node type: computers
  - Number of nodes: 2
  - Number of features: 5
  - db_path: GraphDB\nodes\computers
------------------------------------------------------------

############################################################
EDGE DETAILS
############################################################
Total edge types: 1
------------------------------------------------------------
• Edge type: user_access
  - Number of edges: 6
  - Number of features: 9
  - db_path: GraphDB\edges\user_access
------------------------------------------------------------

############################################################
NODE GENERATOR DETAILS
############################################################
Total node generators: 0
------------------------------------------------------------

############################################################
EDGE GENERATOR DETAILS
############################################################
Total edge generators: 0
------------------------------------------------------------

Conclusion

In this notebook, we explored the process of managing graphs using ParquetGraphDB. Specifically, we:

  • Added new node types and registered nodes within those types.

  • Learned how to create and manage edge types, including adding and updating edges.

  • Explored the functionality of reading and updating data from both node and edge stores.

These capabilities form the foundation for representing and manipulating complex graph-based data efficiently.

What’s Next?

In the next notebook, we will cover node and edge generators. Generators create nodes and edges dynamically from predefined functions, which lets ParquetGraphDB propagate updates to dependent nodes and edges whenever their parent nodes or edges change.