
MLCube and Podman

MLCube is a new open source, container-based infrastructure specification introduced to enable reproducibility in Python-based machine learning workflows. It can use tools such as Podman, Singularity, and Docker, and it also supports execution on remote platforms. Diane Feddema from Red Hat is one of the chairs of the MLCommons Best Practices working group, which is developing MLCube. This introductory article explains how to run the hello world MLCube example using Podman on Fedora Linux.

Yazan Monshed has written a very helpful introduction to Podman on Fedora which gives more details on some of the steps used here.

First install the necessary dependencies.

sudo dnf -y update
sudo dnf -y install podman git virtualenv policycoreutils-python-utils

Then, following the documentation, set up a virtual environment and get the example code. The project is under active development, so check out a specific commit to ensure reproducibility.

virtualenv -p python3 ./env_mlcube
source ./env_mlcube/bin/activate
git clone https://github.com/mlcommons/mlcube_examples.git
cd ./mlcube_examples/hello_world
git checkout 5fe69bd
pip install mlcube mlcube-docker
mlcube describe
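
Before continuing, it can help to confirm that the tools installed above are actually on your PATH. The following is a minimal sketch; `require_cmds` is a hypothetical helper name introduced here for illustration and is not part of MLCube.

```shell
# Hypothetical helper: report any command that is not found on PATH.
# Returns 0 if every command exists, 1 otherwise.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (run inside the activated virtual environment):
#   require_cmds podman git virtualenv mlcube
```

If the function prints nothing and returns 0, all of the listed commands were found.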

Now change the runner command from docker to podman by editing the file $HOME/mlcube.yaml so that the line

docker: docker

becomes

docker: podman
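
The same edit can be scripted instead of made by hand. This is a minimal sketch, assuming the settings file contains a line of exactly the form shown above; `use_podman_runner` is a hypothetical helper name, not an MLCube command.

```shell
# Hypothetical helper: switch the MLCube docker runner command to podman.
# Assumes the file has a line of the form "docker: docker", possibly indented.
use_podman_runner() {
    settings_file="$1"
    # -i.bak edits in place and keeps a backup copy next to the original.
    sed -i.bak \
        's/^\([[:space:]]*docker:[[:space:]]*\)docker[[:space:]]*$/\1podman/' \
        "$settings_file"
}

# Example call:
#   use_podman_runner "$HOME/mlcube.yaml"
```

Only a line whose key is `docker` and whose value is `docker` is changed; other lines are left alone, and the original file is preserved with a `.bak` suffix in case you want to revert.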

If you are on a computer with x86_64 architecture, you can get the container using

mlcube configure --mlcube=. --platform=docker

You will see a number of options

? Please select an image:
  ▸ registry.fedoraproject.org/mlcommons/hello_world:0.0.1
    registry.access.redhat.com/mlcommons/hello_world:0.0.1
    docker.io/mlcommons/hello_world:0.0.1
    quay.io/mlcommons/hello_world:0.0.1

Choose docker.io/mlcommons/hello_world:0.0.1 to obtain the container.
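
If you are unsure which architecture your machine has, `uname -m` reports it. The sketch below maps that value to the `build_strategy` setting this article uses for each case; `suggest_build_strategy` is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper: suggest a build_strategy value for an architecture
# string, following this article: pull on x86_64, auto (build) otherwise.
suggest_build_strategy() {
    case "$1" in
        x86_64) echo "pull" ;;
        *)      echo "auto" ;;
    esac
}

# Print the suggestion for the current machine.
suggest_build_strategy "$(uname -m)"
```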

If you are not on a computer with x86_64 architecture, you will need to build the container. Change the file $HOME/mlcube.yaml so that the line

build_strategy: pull

becomes

build_strategy: auto

and then build the container using

mlcube configure --mlcube=. --platform=docker

To run the example tasks, you may need to set appropriate SELinux labels on the directories the container uses. You can check that SELinux is enabled by typing

sudo sestatus

which should give you output similar to

SELinux status: enabled
...

Josphat Mutai, Christopher Smart and Daniel Walsh explain that you need to be careful in setting appropriate SELinux policies for files used by containers. Here, you will allow the container to read and write to the workspace directory.

sudo semanage fcontext -a -t container_file_t "$PWD/workspace(/.*)?"
sudo restorecon -Rv "$PWD/workspace"

Now verify the directory labels by confirming that

ls -Z

gives output similar to

unconfined_u:object_r:user_home_t:s0 Dockerfile
unconfined_u:object_r:user_home_t:s0 README.md
unconfined_u:object_r:user_home_t:s0 mlcube.yaml
unconfined_u:object_r:user_home_t:s0 requirements.txt
unconfined_u:object_r:container_file_t:s0 workspace
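
The part of the label that matters here is the third field, the SELinux type. As a minimal sketch, the type can be extracted from a context string with `cut`; `selinux_type` is a hypothetical helper name, and the example call assumes GNU `stat`, whose `%C` format prints the SELinux security context.

```shell
# Hypothetical helper: print the type field (third colon-separated field)
# of an SELinux context string such as "user:role:type:level".
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Example call on the workspace directory (assumes GNU stat):
#   selinux_type "$(stat -c %C workspace)"
```

For the workspace directory labeled as above, the example call should print `container_file_t`.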

Now run the two tasks in the example

mlcube run --mlcube=. --task=hello --platform=docker
mlcube run --mlcube=. --task=bye --platform=docker

Finally, check that the output

cat workspace/chats/chat_with_alice.txt

has text similar to

Hi, Alice! Nice to meet you.
Bye, Alice! It was great talking to you.
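
This check can also be scripted, for example in a smoke test. The sketch below assumes the output contains one line starting with each of the two greetings shown above; `check_chat` is a hypothetical helper name, not part of MLCube.

```shell
# Hypothetical helper: succeed only if the chat file contains both the
# greeting written by the hello task and the farewell written by the bye task.
check_chat() {
    chat_file="$1"
    grep -q '^Hi, Alice!' "$chat_file" && grep -q '^Bye, Alice!' "$chat_file"
}

# Example call:
#   check_chat workspace/chats/chat_with_alice.txt && echo "both tasks ran"
```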

You can create your own MLCube as described here. Contributions to the MLCube examples repository are welcome. Udica is a new project that promises more fine-grained SELinux policy controls for containers that are easy for system administrators to apply. Both projects are under active development. Testing them and providing feedback would help make secure data management on systems with SELinux easier and more effective.
