Commit 2a93ff1d authored by Jenni Rinker

adding more text

parent a024806f
# Using Data Version Control with git
A test of using data version control (DVC) to manage data alongside git.
## What is DVC?
Data version control (DVC) is the cool concept of tracking versions of data
files in much the same way that we track versions of code using git. Read more
in the DVC documentation.
## This test
There is an Excel file in the `data/` folder that is pushed to and pulled
from a shared Google Drive folder. There is a test Python script in `code/`
that prints a single cell from the Excel file. When a data file is added to
DVC, DVC creates a `.gitignore` file so that git ignores the data file itself,
and adds a small text file with info about the data (for example
`data/data.xlsx.dvc`) that is tracked by git instead.
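
The idea of tracking a small text file in place of the data itself can be illustrated with a minimal sketch. This is not DVC's actual implementation: the `.pointer` suffix, the `write_pointer` helper, and the layout of the pointer file are all hypothetical, chosen only to show that a content hash plus a file size is enough for git to notice when the data changes.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small text pointer next to data_path (hypothetical format)."""
    pointer_path = data_path.with_name(data_path.name + ".pointer")
    pointer_path.write_text(
        f"md5: {file_md5(data_path)}\n"
        f"size: {data_path.stat().st_size}\n"
        f"path: {data_path.name}\n"
    )
    return pointer_path


def demo() -> str:
    """Create a stand-in data file in a temp folder and return its pointer text."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data.xlsx"
        data.write_bytes(b"pretend this is a large spreadsheet")
        return write_pointer(data).read_text()


if __name__ == "__main__":
    print(demo())
```

In this sketch only the pointer file would be committed to git, while the data file stays ignored. Any change to the data changes the hash, which shows up as an ordinary text diff.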
To run this test you will need:
* DVC installed on your machine
* an Anaconda environment with `pandas` installed
* a Google account that you will use for accessing the shared data
* access to the shared Google Drive used for testing (ask Jenni, and
send the email address of your Google account)
To get the data (assuming you've been granted access to the Google Drive
folder), follow the section "Pulling data from remote (update local)" below.
## DVC workflow
Here are some example workflows. They have not been tested thoroughly and
might be broken; the DVC documentation is a great reference.
### Pulling data from remote (update local)
Use this workflow when a colleague has pushed data and you want to pull it.
The git command assumes that the data has been committed to the master branch.

    git pull origin master
    dvc pull

Now you should have an updated version of the data in your repository.
### Push data to remote (update server)
You've regenerated the data and need to push the results so your colleagues
can pull and analyse them.
The git command assumes that you are committing directly on the master branch.

    dvc add data/data.xlsx
    git commit data/data.xlsx.dvc -m "Dataset updates"
    dvc push
    git push origin master

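
The two workflows above could also be scripted. The following is a minimal sketch, assuming `git` and `dvc` are both on the `PATH` and that the script is run from the repository root. The names `PULL_STEPS`, `PUSH_STEPS`, and `run_steps` are hypothetical helpers, not part of DVC. By default nothing is executed, and the commands are only printed so that they can be checked first.

```python
import shlex
import subprocess
from typing import List, Sequence

# Command sequences copied from the two workflows in this README.
PULL_STEPS = [
    ["git", "pull", "origin", "master"],
    ["dvc", "pull"],
]
PUSH_STEPS = [
    ["dvc", "add", "data/data.xlsx"],
    ["git", "commit", "data/data.xlsx.dvc", "-m", "Dataset updates"],
    ["dvc", "push"],
    ["git", "push", "origin", "master"],
]


def run_steps(steps: Sequence[Sequence[str]], dry_run: bool = True) -> List[str]:
    """Print each command in order and return them as shell-style strings.

    With dry_run=True (the default) nothing is executed.
    """
    done = []
    for step in steps:
        line = shlex.join(step)
        print(f"$ {line}")
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(list(step), check=True)
        done.append(line)
    return done


if __name__ == "__main__":
    run_steps(PULL_STEPS)
```

Calling `run_steps(PUSH_STEPS, dry_run=False)` would actually run the push workflow. Stopping at the first failure avoids pushing a git commit whose data was never uploaded.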