big-data-team/python-course

About

This repository is used in the following courses:

  1. Effective Python
  2. Python for Big Data

Environment Configuration

  1. Download requirements.txt
  2. Create environment:
export env_name="bdt-python-course"
conda create -n $env_name python=3.10
conda activate $env_name
# some packages are no longer available through conda,
# so instead of this:
#   conda install --file requirements.txt
# call pip directly:
pip install -r requirements.txt
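
After the install step, it can be useful to confirm that the active environment matches what the course expects. The following is a minimal sketch, not part of the repository: the helper names `python_matches` and `missing_requirements` are hypothetical, and the parsing assumes simple `name==version`-style lines in requirements.txt. It only checks that each package is installed, not that the installed version matches.

```python
import re
import sys
from importlib import metadata


def python_matches(expected=(3, 10)):
    """Return True if the running interpreter has the expected major.minor version."""
    return tuple(sys.version_info[:2]) == tuple(expected)


def missing_requirements(lines):
    """Return the package names from requirement lines that are not installed.

    Simplification (assumption): only plain "name==version"-style lines are
    handled; comments, blank lines, and pip options such as "-r" are skipped.
    """
    missing = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "-")):
            continue
        # The distribution name is everything before a version specifier,
        # an extras bracket, an environment marker, or whitespace.
        name = re.split(r"[=<>!~\[;\s]", line, maxsplit=1)[0]
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

One way to use it from the activated environment is to open requirements.txt, pass the file object to `missing_requirements`, and print the result alongside `python_matches()`.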

For more information about Python virtual environments see:

List the available conda environments with:

conda info --envs

If you need to remove the environment, use the following command:

conda remove --name $env_name --all

HowTos

How to use pylint:

pylint --output-format=colorized -v inverted_index.py
# to ignore some messages, disable them by ID or, equivalently, by name:
pylint --output-format=colorized -d C0111,C0103 -v inverted_index.py
pylint --output-format=colorized -d invalid-name,missing-docstring -v inverted_index.py
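
Besides the `-d` command-line option, pylint also accepts inline pragmas in the source, which keeps a suppression next to the line it applies to. The module below is a hypothetical illustration, not the course's inverted_index.py: the camelCase variable would normally trigger `invalid-name` (C0103) under pylint's default snake_case naming rules, and the trailing comment disables that message for that line only.

```python
"""Tiny module used to illustrate an inline pylint pragma."""


def word_count(text):
    """Count the whitespace-separated words in text."""
    # camelCase breaks the default snake_case rule for variables;
    # the pragma silences invalid-name for this single line.
    wordTotal = len(text.split())  # pylint: disable=invalid-name
    return wordTotal
```

Inline pragmas suit one-off exceptions; for messages you want ignored everywhere, the `-d` option shown above is the less noisy choice.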

How to use pytest:

pytest -v .
pytest --cov -v .
pytest --cov -vv --durations=0 .
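
For reference, pytest collects functions whose names start with `test_` from files named `test_*.py` or `*_test.py`, and the `--cov` option comes from the pytest-cov plugin. The sketch below shows the shape of such a test. The function `build_inverted_index` is a hypothetical stand-in written for this example and is not taken from the course's inverted_index.py.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercased word to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def test_word_maps_to_all_documents_containing_it():
    """A plain assert is enough; pytest reports the compared values on failure."""
    index = build_inverted_index({1: "big data", 2: "Big Python"})
    assert index["big"] == [1, 2]
    assert index["python"] == [2]
```

In a real project the function under test would live in its own module and the test file would import it; both are kept together here only to make the example self-contained.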

Study Datasets

  1. Wikipedia sample - do not forget to unzip after download
  2. Stop words
  3. DVC experimental data - do not forget to unzip after download
  4. Stack Overflow posts dump sample (XML)
  5. Pandas HW data

Study Artefacts

  1. Flask 404 template

About

Course on how to write clean, maintainable, and scalable code in Python
