A scikit-learn Pipeline model

chicago-model-python

Model age

54 days old

Model details

  • The city of Chicago publishes the results of restaurant health code inspections through the Chicago Department of Public Health. This model predicts the inspection outcome from five features: facility_type, risk, total_violations, month, and year.
  • The deployed model is a scikit-learn Pipeline that combines an encoder for the categorical variables with a RandomForestClassifier that makes the predictions.
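
A minimal sketch of a pipeline with this shape is given below. The model card does not say which encoder, hyperparameters, or outcome labels were used, so the OrdinalEncoder, the choice of facility_type and risk as the categorical columns, the n_estimators value, the PASS/FAIL labels, and the toy rows are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Feature names come from the model card; which ones are categorical is assumed.
FEATURES = ["facility_type", "risk", "total_violations", "month", "year"]
CATEGORICAL = ["facility_type", "risk"]


def build_pipeline(random_state=0):
    """Encode the categorical columns, then fit a random forest."""
    encoder = make_column_transformer(
        # Assumed encoder: unseen categories map to -1 instead of raising.
        (OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1), CATEGORICAL),
        remainder="passthrough",  # numeric columns pass through unchanged
    )
    forest = RandomForestClassifier(n_estimators=50, random_state=random_state)
    return Pipeline([("encode", encoder), ("forest", forest)])


# Toy rows and hypothetical outcome labels, not real inspection records.
toy = pd.DataFrame(
    {
        "facility_type": ["RESTAURANT", "BAKERY", "RESTAURANT", "GROCERY STORE"],
        "risk": ["RISK 1 (HIGH)", "RISK 2 (MEDIUM)", "RISK 1 (HIGH)", "RISK 3 (LOW)"],
        "total_violations": [31.0, 2.0, 18.0, 0.0],
        "month": [11, 3, 7, 1],
        "year": [2019, 2020, 2021, 2022],
    }
)
labels = ["FAIL", "PASS", "FAIL", "PASS"]

model = build_pipeline().fit(toy[FEATURES], labels)
predictions = model.predict(toy[FEATURES])
```

Keeping the encoder inside the Pipeline means the same encoding is applied at training and prediction time, so callers can pass raw string categories.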

Intended use

  • The primary intended users of this model are people who are interested in health inspection data from Chicago.
  • Some use cases are out of scope for this model, such as using it to predict real-world health inspection outcomes.

Training data & evaluation data

  • The training dataset for this model has the prototype:
{'facility_type': {'default': 'RESTAURANT',
  'title': 'Facility Type',
  'type': 'string'},
 'risk': {'default': 'RISK 1 (HIGH)', 'title': 'Risk', 'type': 'string'},
 'total_violations': {'default': 31.0,
  'title': 'Total Violations',
  'type': 'number'},
 'month': {'default': 11, 'title': 'Month', 'type': 'integer'},
 'year': {'default': 2019, 'title': 'Year', 'type': 'integer'}}
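
As a rough illustration of how this prototype could be used, the sketch below builds a default input row from it and checks a candidate row against the declared types. The helper names (default_row, validation_errors) and the mapping from the prototype's type names to Python types are assumptions for this example, not part of the deployed model.

```python
# Input prototype copied from the model card.
PROTOTYPE = {
    "facility_type": {"default": "RESTAURANT", "title": "Facility Type", "type": "string"},
    "risk": {"default": "RISK 1 (HIGH)", "title": "Risk", "type": "string"},
    "total_violations": {"default": 31.0, "title": "Total Violations", "type": "number"},
    "month": {"default": 11, "title": "Month", "type": "integer"},
    "year": {"default": 2019, "title": "Year", "type": "integer"},
}

# Assumed mapping from the prototype's type names to Python types.
PYTHON_TYPES = {"string": str, "number": (int, float), "integer": int}


def default_row(prototype):
    """Build one input row from the prototype's default values."""
    return {name: spec["default"] for name, spec in prototype.items()}


def validation_errors(row, prototype):
    """Return a list of human-readable problems; an empty list means the row fits."""
    errors = []
    for name, spec in prototype.items():
        if name not in row:
            errors.append(f"missing field: {name}")
            continue
        value = row[name]
        expected = PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    return errors


row = default_row(PROTOTYPE)
bad_row = {**row, "month": "November"}  # month must be an integer
```

Here validation_errors(row, PROTOTYPE) returns an empty list, while bad_row is reported with a single type error on month.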

Ethical considerations

  • This model does not use personal data and is not used in production. However, inspection failures also depend on inputs that this model does not track, and those should be taken into account in any production use case.

Caveats & recommendations

  • This model does use real-world health inspection data for the city of Chicago.
  • This model was not built to be the best possible model; it is best used for exploring the data and the tools used to create this dashboard.

Model performance over time. In this context, performance means statistical properties of the model's predictions, specifically accuracy and recall. The data is grouped by week, from January 2023 through July 2023.

Table columns: index, n, metric, estimate
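
To make the weekly grouping concrete, here is a small dependency-free sketch that produces rows with the columns index, n, metric, and estimate. It is not the dashboard's own code: the Monday week start, the choice of "FAIL" as the positive class for recall, and the sample records are all assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week containing `day` (assumed week boundary)."""
    return day - timedelta(days=day.weekday())


def weekly_metrics(records, positive="FAIL"):
    """Compute accuracy and recall per week from (date, truth, prediction) tuples."""
    by_week = defaultdict(list)
    for day, truth, pred in records:
        by_week[week_start(day)].append((truth, pred))

    rows = []
    for start in sorted(by_week):
        pairs = by_week[start]
        n = len(pairs)
        correct = sum(truth == pred for truth, pred in pairs)
        actual_positive = [pred for truth, pred in pairs if truth == positive]
        true_positive = sum(pred == positive for pred in actual_positive)
        # Recall is undefined for a week with no positive cases.
        recall = true_positive / len(actual_positive) if actual_positive else None
        rows.append({"index": start, "n": n, "metric": "accuracy", "estimate": correct / n})
        rows.append({"index": start, "n": n, "metric": "recall", "estimate": recall})
    return rows


# Made-up records: (inspection date, actual result, predicted result).
records = [
    (date(2023, 1, 2), "FAIL", "FAIL"),
    (date(2023, 1, 3), "FAIL", "PASS"),
    (date(2023, 1, 5), "PASS", "PASS"),
    (date(2023, 1, 6), "PASS", "PASS"),
    (date(2023, 1, 10), "PASS", "FAIL"),
    (date(2023, 1, 11), "FAIL", "FAIL"),
]
metrics = weekly_metrics(records)
```

For these sample records the week of 2 January has accuracy 0.75 and recall 0.5, and the week of 9 January has accuracy 0.5 and recall 1.0.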

Inspections that our model misclassified, in either direction.

Table columns: results, preds, facility_type, risk, aka_name, inspection_date
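
A table like this can be obtained by keeping only the rows where the actual result differs from the prediction. The sketch below assumes each inspection is a dictionary keyed by the column names listed for this table; the establishment names, dates, and PASS/FAIL labels are invented for the example.

```python
COLUMNS = ["results", "preds", "facility_type", "risk", "aka_name", "inspection_date"]


def misclassified(rows):
    """Keep rows whose prediction differs from the actual result, in either direction."""
    return [
        {column: row[column] for column in COLUMNS}
        for row in rows
        if row["results"] != row["preds"]
    ]


# Invented example rows; names and dates are placeholders, not real inspections.
inspections = [
    {"results": "FAIL", "preds": "PASS", "facility_type": "RESTAURANT",
     "risk": "RISK 1 (HIGH)", "aka_name": "EXAMPLE CAFE", "inspection_date": "2023-01-03"},
    {"results": "PASS", "preds": "PASS", "facility_type": "BAKERY",
     "risk": "RISK 2 (MEDIUM)", "aka_name": "EXAMPLE BAKERY", "inspection_date": "2023-01-05"},
    {"results": "PASS", "preds": "FAIL", "facility_type": "RESTAURANT",
     "risk": "RISK 1 (HIGH)", "aka_name": "EXAMPLE DINER", "inspection_date": "2023-01-10"},
]
errors = misclassified(inspections)
```

The first kept row is a missed failure and the second is a false alarm, which are the two directions of misclassification.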