The repository is used during the following courses:
- Effective Python
- How to write clean, maintainable and scalable code in Python
- https://bigdatateam.org/ru/python-course
- Python for Big Data
- Unix CLI
- data analysis with regex, pandas and SQL
- reproducible research
- computer science fundamentals: data structures, algorithms, Big O notation (complexity)
- https://bigdatateam.org/ru/python-for-big-data-analysis
- Download requirements.txt
- Create the environment:
export env_name="bdt-python-course"
conda create -n $env_name python=3.10
conda activate $env_name
# some packages are no longer supported by conda,
# so instead of this:
#   conda install --file requirements.txt
# call pip directly:
pip install -r requirements.txt
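Once the install finishes, it can be useful to confirm that the active environment really contains the pinned versions. The following is a minimal sketch, assuming requirements.txt consists of simple `name==version` lines (extras, environment markers and other specifiers are not handled). The helper names `parse_pins` and `find_mismatches` are hypothetical and not part of the course code.

```python
from importlib import metadata


def parse_pins(lines):
    """Return {package: version} for simple 'name==version' lines."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines, options and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Inside the activated environment, `find_mismatches(parse_pins(open("requirements.txt")))` should return an empty dict when every pinned package is installed at the expected version.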
For more information about Python virtual environments, see:
- https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/
- https://docs.python.org/3/library/site.html
- https://docs.python.org/3/library/venv.html
List the available conda environments with:
conda info --envs
To remove the environment, run:
conda remove --name $env_name --all
How to use pylint:
pylint --output-format=colorized -v inverted_index.py
# to ignore some warnings, disable them by message ID or, equivalently, by name:
pylint --output-format=colorized -d C0111,C0103 -v inverted_index.py
pylint --output-format=colorized -d invalid-name,missing-docstring -v inverted_index.py
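The same checks can also be switched off inside a source file with a `# pylint: disable=` comment, which keeps the exception next to the code it applies to. A minimal sketch follows; the module and the function `computeDouble` are hypothetical and only illustrate the comment syntax.

```python
# pylint: disable=invalid-name,missing-docstring
# The comment above switches off both checks for the rest of this module.


def computeDouble(inputValue):
    # camelCase names would normally be reported as invalid-name (C0103),
    # and the absent docstring as missing-docstring (C0111).
    return inputValue * 2
```

Command-line `-d` is convenient for a whole run, while an in-file comment documents why one particular module is exempt.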
How to use pytest:
pytest -v .
pytest --cov -v .
pytest --cov -vv --durations=0 .
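For reference, pytest collects files named `test_*.py` and runs every function in them whose name starts with `test_`. A minimal sketch of such a file is shown below; the file name `test_example.py` and the toy function `count_words` are assumptions made for illustration and are not part of the course code.

```python
# test_example.py (hypothetical file name)


def count_words(text):
    """Toy function under test: count whitespace-separated words."""
    return len(text.split())


def test_count_words_simple():
    # A plain assert is enough; pytest reports the compared values on failure.
    assert count_words("to be or not to be") == 6


def test_count_words_empty():
    assert count_words("") == 0
```

With the commands above, `-v` prints one line per test, `--cov` adds a coverage report (it relies on the pytest-cov plugin being installed), and `--durations=0` lists the duration of every test.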
- Wikipedia sample - do not forget to unzip after download
- Stop words
- DVC experimental data - do not forget to unzip after download
- Stack Overflow posts dump sample (XML)
- Pandas HW data
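
The unzip step for the archived datasets can also be done from Python, which is handy on machines without an `unzip` utility. This is a minimal sketch, assuming the downloads are ordinary `.zip` archives; the helper `unzip` and any archive names passed to it are hypothetical.

```python
import zipfile
from pathlib import Path


def unzip(archive_path, target_dir="."):
    """Extract every member of the archive into target_dir and return their names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

For example, `unzip("sample.zip", "data")` would extract a downloaded archive called `sample.zip` into a `data` folder and return the list of extracted file names.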