Stack Exchange hosts questions and answers that users can upvote or downvote. Two of its sites, datascience.stackexchange and Stack Overflow, look useful for our data science goals.
To find the most popular content, the most promising tables are those that record how often each tag is used. They include tags for machine learning and model building. Other tables and columns worth a look are Posts, Tags, PostTags, AnswerCount, CommentCount, FavoriteCount, and UpVotes.
#SELECT Id,
# CreationDate,
# Score,
# ViewCount,
# Tags,
# AnswerCount,
# FavoriteCount
#FROM Posts
#WHERE PostTypeID = 1 AND YEAR(CreationDate) = 2019;
import pandas as pd
questions = pd.read_csv("2019_questions.csv", parse_dates = ["CreationDate"])
questions.head()
| | Id | CreationDate | Score | ViewCount | Tags | AnswerCount | FavoriteCount |
|---|---|---|---|---|---|---|---|
0 | 44419 | 2019-01-23 09:21:13 | 1 | 21 | <machine-learning><data-mining> | 0 | NaN |
1 | 44420 | 2019-01-23 09:34:01 | 0 | 25 | <machine-learning><regression><linear-regressi... | 0 | NaN |
2 | 44423 | 2019-01-23 09:58:41 | 2 | 1651 | <python><time-series><forecast><forecasting> | 0 | NaN |
3 | 44427 | 2019-01-23 10:57:09 | 0 | 55 | <machine-learning><scikit-learn><pca> | 1 | NaN |
4 | 44428 | 2019-01-23 11:02:15 | 0 | 19 | <dataset><bigdata><data><speech-to-text> | 0 | NaN |
questions.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8839 entries, 0 to 8838
Data columns (total 7 columns):
Id               8839 non-null int64
CreationDate     8839 non-null datetime64[ns]
Score            8839 non-null int64
ViewCount        8839 non-null int64
Tags             8839 non-null object
AnswerCount      8839 non-null int64
FavoriteCount    1407 non-null float64
dtypes: datetime64[ns](1), float64(1), int64(4), object(1)
memory usage: 483.5+ KB
FavoriteCount has missing values: only 1407 of the 8839 rows are non-null, so roughly 7400 questions (about 84%) have no value. Looking each one up is not practical. The column is also stored as a float, because pandas uses float64 to hold NaN, although a count should be an integer. The types of the remaining columns are reasonable. For Tags, it would help to strip the <> delimiters so we can study the most common tags.
questions.isnull().sum()
Id                  0
CreationDate        0
Score               0
ViewCount           0
Tags                0
AnswerCount         0
FavoriteCount    7432
dtype: int64
The above confirms that FavoriteCount has 7432 missing values.
questions = questions.fillna(0)
questions.isnull().sum()
Id               0
CreationDate     0
Score            0
ViewCount        0
Tags             0
AnswerCount      0
FavoriteCount    0
dtype: int64
We have just replaced the missing (NaN) FavoriteCount values with 0. Next we change this column's data type to integer.
questions['FavoriteCount'] = questions['FavoriteCount'].astype(int)
questions.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8839 entries, 0 to 8838
Data columns (total 7 columns):
Id               8839 non-null int64
CreationDate     8839 non-null datetime64[ns]
Score            8839 non-null int64
ViewCount        8839 non-null int64
Tags             8839 non-null object
AnswerCount      8839 non-null int64
FavoriteCount    8839 non-null int64
dtypes: datetime64[ns](1), int64(5), object(1)
memory usage: 483.5+ KB
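The fill-then-cast step above can be reproduced on a toy column. This is a minimal sketch with made-up values (not rows from 2019_questions.csv); it also shows pandas' nullable `Int64` dtype as one possible alternative when a missing value should stay missing rather than become 0.

```python
import numpy as np
import pandas as pd

# Toy data: hypothetical values, not taken from 2019_questions.csv
toy = pd.DataFrame({"FavoriteCount": [1.0, np.nan, 3.0]})

# A single NaN forces the whole column to float64
dtype_before = toy["FavoriteCount"].dtype

# Option 1 (what the notebook does): treat missing as zero, then cast to int
filled = toy["FavoriteCount"].fillna(0).astype(int)

# Option 2 (an alternative, not used in the notebook): keep the missing
# marker by casting to the nullable integer dtype
nullable = toy["FavoriteCount"].astype("Int64")

print(dtype_before)            # float64
print(filled.tolist())         # [1, 0, 3]
print(nullable.isna().sum())   # 1
```

Option 1 assumes that a missing FavoriteCount means "no favorites", which is the assumption the notebook makes when it fills with 0.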
questions['Tags'].head(5)
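0    <machine-learning><data-mining>
1    <machine-learning><regression><linear-regressi...
2    <python><time-series><forecast><forecasting>
3    <machine-learning><scikit-learn><pca>
4    <dataset><bigdata><data><speech-to-text>
Name: Tags, dtype: object
# Strip the leading "<" and trailing ">"; regex=True is required in pandas 2.0+,
# where str.replace treats the pattern as a literal string by default
questions['Tags'] = questions['Tags'].str.replace("^<|>$", "", regex=True)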
questions['Tags'].head(5)
0    machine-learning><data-mining
1    machine-learning><regression><linear-regressio...
2    python><time-series><forecast><forecasting
3    machine-learning><scikit-learn><pca
4    dataset><bigdata><data><speech-to-text
Name: Tags, dtype: object
questions['Tags'] = questions['Tags'].str.split("><")
questions['Tags'].head(5)
0    [machine-learning, data-mining]
1    [machine-learning, regression, linear-regressi...
2    [python, time-series, forecast, forecasting]
3    [machine-learning, scikit-learn, pca]
4    [dataset, bigdata, data, speech-to-text]
Name: Tags, dtype: object
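The two-step cleanup above (strip the outer brackets, then split on `><`) can also be done in one pass with a regular expression. The sketch below uses only the standard library; `parse_tags` is a hypothetical helper name introduced for illustration, not something from the notebook.

```python
import re

def parse_tags(raw):
    """Return the tag names inside a string such as '<python><pca>'."""
    # Each tag is whatever sits between one '<' and the next '>'
    return re.findall(r"<([^<>]+)>", raw)

print(parse_tags("<machine-learning><data-mining>"))
# ['machine-learning', 'data-mining']
```

If this form were preferred, it could presumably be applied to the raw column with `questions['Tags'].apply(parse_tags)`; the notebook's own replace-and-split approach gives the same lists for well-formed tag strings.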
questions.head(3)
| | Id | CreationDate | Score | ViewCount | Tags | AnswerCount | FavoriteCount |
|---|---|---|---|---|---|---|---|
0 | 44419 | 2019-01-23 09:21:13 | 1 | 21 | [machine-learning, data-mining] | 0 | 0 |
1 | 44420 | 2019-01-23 09:34:01 | 0 | 25 | [machine-learning, regression, linear-regressi... | 0 | 0 |
2 | 44423 | 2019-01-23 09:58:41 | 2 | 1651 | [python, time-series, forecast, forecasting] | 0 | 0 |
# Count how many questions carry each tag
num_tags = {}
for tags in questions['Tags']:
    for tag in tags:
        if tag in num_tags:
            num_tags[tag] += 1
        else:
            num_tags[tag] = 1
print(num_tags)
{'pandas': 354, 'scalability': 4, 'probability': 76, 'chatbot': 14, 'kendalls-tau-coefficient': 1, 'cloud-computing': 9, 'colab': 18, 'refit-model': 1, 'beginner': 27, 'orange': 64, 'bayesian-nonparametric': 2, 'xboost': 1, 'math': 37, 'serialisation': 3, 'ai': 25, 'nltk': 43, 'actor-critic': 21, 'tools': 8, 'estimators': 8, 'rmsle': 1, 'tsne': 15, 'similarity': 72, 'multilabel-classification': 92, 'mean-shift': 2, 'multitask-learning': 7, 'redshift': 1, 'kaggle': 43, 'community': 1, 'time': 5, 'code': 5, 'mathematics': 17, 'gan': 85, 'generalization': 12, 'label-flipping': 1, 'azure-ml': 12, 'dirichlet': 4, 'software-recommendation': 4, 'lbp': 2, 'consumerweb': 1, 'data.table': 4, 'bioinformatics': 4, 'multi-output': 7, 'ipython': 18, 'seaborn': 38, 'management': 2, 'text-generation': 17, 'cloud': 6, 'convnet': 111, 'sensors': 5, 'methodology': 10, 'density-estimation': 3, 'mnist': 23, 'education': 3, 'tesseract': 3, 'siamese-networks': 4, 'pca': 85, 'haar-cascade': 1, 'learning-to-rank': 6, 'summarunner-architecture': 1, 'lstm': 402, 'terminology': 16, 'scraping': 5, 'amazon-ml': 1, 'smotenc': 4, 'nvidia': 7, 'parameter-estimation': 6, 'normalization': 74, 'discounted-reward': 5, 'gmm': 2, 'c': 4, 'weighted-data': 14, 'yolo': 21, 'tfidf': 31, 'anova': 2, 'search': 5, 'automation': 4, 'meta-learning': 3, 'bayesian': 40, 'ann': 2, 'efficiency': 2, 'octave': 4, 'imbalanced-learn': 21, 'ab-test': 6, 'market-basket-analysis': 12, 'topic-model': 31, 'naive-bayes-classifier': 42, 'torch': 4, 'matrix': 22, 'feature-extraction': 87, 'data-leakage': 8, 'generative-models': 46, 'siamese': 1, 'prediction': 128, 'nn': 1, 'bayesian-networks': 12, 'sql': 29, 'q-learning': 37, 'discriminant-analysis': 5, 'word-embeddings': 117, 'neural': 16, 'batch-normalization': 29, 'google-cloud': 1, 'categories': 2, 'inception': 10, 'linear-regression': 175, 'feature-selection': 209, 'counts': 3, 'kernel': 27, 'helmert-coding': 1, 'pipelines': 17, 'etl': 6, 'pac-learning': 6, 
'deep-learning': 1220, 'spyder': 1, 'random-forest': 159, 'bias': 19, 'implementation': 9, 'multivariate-distribution': 1, 'sematic-similarity': 2, 'jaccard-coefficient': 4, 'counter-inference': 1, 'text': 41, 'sequential-pattern-mining': 17, 'javascript': 8, 'gaussian': 20, 'grid-search': 35, 'anonymization': 3, 'impala': 1, 'mutual-information': 5, 'monte-carlo': 15, 'clustering': 257, 'stemming': 2, 'historgram': 7, 'spacy': 20, 'marginal-effects': 1, 'orange3': 20, 'correlation': 80, 'class-imbalance': 73, 'gensim': 36, 'openai-gpt': 2, 'mse': 8, 'pytorch-geometric': 2, 'descriptive-statistics': 21, 'parquet': 1, 'manhattan': 3, 'k-means': 81, 'dummy-variables': 19, 'missing-data': 43, 'activation': 1, 'bayes-error': 1, 'pip': 4, 'dynamic-programming': 3, 'one-shot-learning': 2, 'paperspace': 1, 'gru': 1, 'unseen-data': 1, 'metadata': 2, 'dump': 1, 'matrix-factorisation': 24, 'dbscan': 18, 'policy-gradients': 27, 'geospatial': 27, 'named-entity-recognition': 36, 'adaboost': 1, 'cosine-distance': 21, 'distribution': 57, 'non-convex': 1, 'predictive-modeling': 265, 'classifier': 18, 'forecast': 34, 'noisification': 1, 'keras': 935, 'markov': 4, 'dataframe': 81, 'model-selection': 58, 'processing': 5, 'dialog-flow': 2, 'project-planning': 6, 'opencv': 39, 'recurrent-neural-net': 91, 'infographics': 2, 'transformer': 45, 'google': 17, 'glm': 3, 'hurdle-model': 1, 'genetic-programming': 2, 'data-transfer': 1, 'books': 7, 'competitions': 2, 'distance': 44, 'vector-space-models': 7, 'hyperparameter': 42, 'rbm': 4, 'doc2vec': 3, 'data-augmentation': 24, 'annotation': 12, 'heatmap': 9, 'aws': 20, 'vae': 14, 'probabilistic-programming': 9, 'association-rules': 19, 'reshape': 9, 'numerical': 6, 'relational-dbms': 7, 'dimensionality-reduction': 69, 'parameter': 5, 'fuzzy-classification': 3, 'data-science-model': 186, 'networkx': 2, 'databases': 29, 'open-set': 2, 'movielens': 2, 'twitter': 8, 'recommender-system': 103, 'expectation-maximization': 5, 'sampling': 38, 
'indexing': 6, 'reinforcement-learning': 203, 'experiments': 3, 'survival-analysis': 10, 'r': 268, 'homework': 4, 'computer-vision': 121, 'crawling': 3, 'pattern-recognition': 1, 'csv': 27, 'lasso': 8, 'learning': 10, 'finance': 17, 'lightgbm': 23, 'gbm': 10, 'ml': 7, 'sequence': 25, 'activity-recognition': 5, 'image-preprocessing': 67, 'keras-rl': 6, 'social-network-analysis': 11, 'wolfram-language': 3, 'career': 9, 'optimization': 124, 'boosting': 49, 'gridsearchcv': 28, 'smote': 27, 'epochs': 11, 'sports': 3, 'text-mining': 113, 'encoding': 54, 'data-cleaning': 157, 'word2vec': 88, 'embeddings': 44, 'scipy': 40, 'gradient-descent': 98, 'sparsity': 2, 'unsupervised-learning': 110, 'web-scrapping': 8, 'software-development': 2, 'learning-rate': 8, 'ensemble-modeling': 30, 'regularization': 50, 'xgboost': 165, 'mini-batch-gradient-descent': 10, 'word': 2, 'dqn': 36, 'cnn': 489, 'interpolation': 6, 'google-prediction-api': 2, 'aggregation': 12, 'hardware': 12, 'numpy': 117, 'research': 11, 'ndcg': 5, 'spearmans-rank-correlation': 1, 'softmax': 24, 'outlier': 48, 'arima': 11, 'feature-reduction': 4, 'kitti-dataset': 1, 'variance': 35, 'parallel': 8, 'hyperparameter-tuning': 59, 'anomaly-detection': 92, 'corpus': 1, 'probability-calibration': 11, 'pickle': 9, 'linear-algebra': 24, 'groupby': 2, 'data-product': 3, 'question-answering': 4, 'language-model': 25, 'object-recognition': 14, 'statsmodels': 1, 'pgm': 1, 'objective-function': 4, 'alex-net': 5, 'features': 32, 'reference-request': 18, 'version-control': 1, 'encoder': 1, 'sentiment-analysis': 37, 'library': 2, 'autoencoder': 106, 'cost-function': 25, 'python': 1814, 'least-squares-svm': 1, 'data-stream-mining': 4, 'structured-data': 5, 'confusion-matrix': 27, 'classification': 685, 'data-mining': 217, 'overfitting': 69, 'rnn': 149, 'momentum': 3, 'collinearity': 6, 'nlp': 493, 'audio-recognition': 25, 'vc-theory': 5, 'manifold': 1, 'anaconda': 20, 'machine-learning-model': 224, 'pooling': 4, 'simulation': 11, 
'neural-network': 1055, 'image-size': 6, 'julia': 2, 'image-recognition': 86, 'image-segmentation': 3, 'graphs': 47, 'supervised-learning': 82, 'notation': 4, 'deepmind': 7, 'openai-gym': 17, 'explainable-ai': 10, 'logistic-regression': 154, 'time-series': 466, 'vgg16': 21, 'tableau': 9, 'deep-network': 29, 'image-classification': 211, 'information-retrieval': 32, 'sequence-to-sequence': 35, 'svm': 136, 'graphical-model': 3, 'nlg': 9, 'faster-rcnn': 38, 'one-hot-encoding': 4, 'linux': 5, 'decision-trees': 145, 'data-formats': 9, 'usecase': 2, 'neural-style-transfer': 8, 'weka': 19, 'bigdata': 95, 'regression': 347, 'similar-documents': 20, 'pearsons-correlation-coefficient': 2, 'cause-effect-relations': 1, 'randomized-algorithms': 6, 'convolution': 103, 'evaluation': 66, 'markov-hidden-model': 13, 'scikit-learn': 540, 'history': 1, 'non-parametric': 3, 'hog': 1, 'java': 14, 'fastai': 6, 'gpu': 42, 'multiclass-classification': 131, 'svr': 5, 'noise': 17, 'nl2sql': 1, 'self-driving': 3, 'causalimpact': 2, 'loss-function': 161, 'perceptron': 26, 'weight-initialization': 12, 'marketing': 6, 'open-source': 1, 'stanford-nlp': 9, 'frequentist': 1, 'automl': 2, 'normal-equation': 1, 'rbf': 5, 'representation': 9, 'pytorch': 175, 'active-learning': 4, 'dropout': 15, 'forecasting': 85, '3d-object-detection': 1, 'machine-translation': 28, 'regex': 8, 'matplotlib': 77, 'predict': 3, 'data-analysis': 71, 'markov-process': 14, 'mlp': 34, 'machine-learning': 2693, 'scoring': 12, 'hinge-loss': 7, 'object-detection': 109, 'exploitation': 1, 'data-indexing-techniques': 1, 'privacy': 6, 'attention-mechanism': 26, 'data-wrangling': 15, 'distributed': 7, 'pyspark': 40, 'theano': 4, 'text-classification': 1, 'churn': 15, 'data': 213, 'training': 148, 'knime': 1, 'anomaly': 4, 'score': 14, 'finite-precision': 2, 'theory': 11, 'excel': 24, 'cross-validation': 139, 'apache-nifi': 1, '3d-reconstruction': 9, 'clusters': 10, 'unbalanced-classes': 42, 'algorithms': 68, 'feature-engineering': 
163, 'powerbi': 10, 'convergence': 17, 'categorical-data': 81, 'rmse': 1, 'gaussian-process': 12, 'parsing': 3, 'matlab': 62, 'error-handling': 17, 'plotting': 32, 'ensemble': 7, 'nosql': 3, 'state-of-the-art': 1, 'dataset': 340, 'categorical-encoding': 3, 'libsvm': 1, 'genetic-algorithms': 16, 'ensemble-learning': 11, 'text-filter': 2, 'mongodb': 2, 'label-smoothing': 1, 'hive': 2, 'evolutionary-algorithms': 11, 'predictor-importance': 9, 'speech-to-text': 8, 'self-study': 8, 'apache-spark': 35, 'feature-scaling': 59, 'wikipedia': 1, '.net': 1, 'labels': 28, 'coursera': 3, 'apache-hadoop': 13, 'information-theory': 9, 'game': 7, 'caffe': 7, 'python-3.x': 13, 'lda': 27, 'map-reduce': 3, 'james-stein-encoder': 1, 'sagemaker': 8, 'dplyr': 6, 'backpropagation': 65, 'accuracy': 89, 'statistics': 234, 'ggplot2': 3, 'fuzzy-logic': 13, 'bert': 64, 'ibm-watson': 1, 'visualization': 126, 'tensorflow': 584, 'auc': 3, 'pathfinder': 1, 'feature-construction': 16, 'allennlp': 2, 'ranking': 22, 'programming': 7, 'sas': 6, 'spss': 2, 'metric': 60, 'feature-map': 2, 'definitions': 4, 'lda-classifier': 1, 'aws-lambda': 2, 'image': 32, 'goss': 1, 'natural-language-process': 124, 'difference': 5, 'h2o': 4, 'data-imputation': 16, 'mcmc': 4, 'ridge-regression': 7, 'cs231n': 1, 'ngrams': 7, 'transfer-learning': 69, 'json': 10, 'domain-adaptation': 3, 'multi-instance-learning': 2, 'binary': 26, 'proximal-svm': 1, 'rdkit': 1, 'preprocessing': 120, 'huggingface': 2, 'jupyter': 41, 'pruning': 3, 'performance': 27, 'stacked-lstm': 7, 'rstudio': 15, 'tokenization': 6, 'finetuning': 7, 'c++': 1, 'search-engine': 4, 'activation-function': 44, 'k-nn': 50, 'hierarchical-data-format': 7, 'online-learning': 13, 'scala': 9, 'methods': 4, 'ocr': 26, 'automatic-summarization': 10, 'inceptionresnetv2': 6, 'semi-supervised-learning': 18}
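The counting loop above is a hand-written tally. As a sketch of an equivalent standard-library approach, `collections.Counter` does the same bookkeeping; the `tag_lists` variable below is a stand-in for `questions['Tags']`, filled with rows 0, 2 and 3 from the `questions.head()` output (row 1 is left out because its tag list is truncated in the display).

```python
from collections import Counter

# Stand-in for questions['Tags']: rows 0, 2 and 3 shown by questions.head()
tag_lists = [
    ["machine-learning", "data-mining"],
    ["python", "time-series", "forecast", "forecasting"],
    ["machine-learning", "scikit-learn", "pca"],
]

# Flatten the lists of tags and count each tag once per question
tag_counts = Counter(tag for tags in tag_lists for tag in tags)

print(tag_counts.most_common(1))  # [('machine-learning', 2)]
```

`most_common(n)` returns the n highest counts already sorted in descending order, which would avoid the separate sort step.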
no_of_tags = pd.DataFrame.from_dict(num_tags, orient='index')
print(no_of_tags.head())
                            0
pandas                    354
scalability                 4
probability                76
chatbot                    14
kendalls-tau-coefficient    1
# Sort the tags by usage count, ascending, so the most used tags come last
times_tag_used = no_of_tags.sort_values(by=0)
print(times_tag_used)
0 exploitation 1 apache-nifi 1 haar-cascade 1 nl2sql 1 cs231n 1 spyder 1 multivariate-distribution 1 noisification 1 counter-inference 1 statsmodels 1 james-stein-encoder 1 state-of-the-art 1 pathfinder 1 non-convex 1 impala 1 pattern-recognition 1 summarunner-architecture 1 pgm 1 helmert-coding 1 amazon-ml 1 data-transfer 1 proximal-svm 1 knime 1 siamese 1 history 1 cause-effect-relations 1 nn 1 adaboost 1 least-squares-svm 1 hog 1 ... ... feature-engineering 163 xgboost 165 linear-regression 175 pytorch 175 data-science-model 186 reinforcement-learning 203 feature-selection 209 image-classification 211 data 213 data-mining 217 machine-learning-model 224 statistics 234 clustering 257 predictive-modeling 265 r 268 dataset 340 regression 347 pandas 354 lstm 402 time-series 466 cnn 489 nlp 493 scikit-learn 540 tensorflow 584 classification 685 keras 935 neural-network 1055 deep-learning 1220 python 1814 machine-learning 2693 [526 rows x 1 columns]
top_times_tags_used = times_tag_used.tail(25)
top_times_tags_used
| | 0 |
|---|---|
reinforcement-learning | 203 |
feature-selection | 209 |
image-classification | 211 |
data | 213 |
data-mining | 217 |
machine-learning-model | 224 |
statistics | 234 |
clustering | 257 |
predictive-modeling | 265 |
r | 268 |
dataset | 340 |
regression | 347 |
pandas | 354 |
lstm | 402 |
time-series | 466 |
cnn | 489 |
nlp | 493 |
scikit-learn | 540 |
tensorflow | 584 |
classification | 685 |
keras | 935 |
neural-network | 1055 |
deep-learning | 1220 |
python | 1814 |
machine-learning | 2693 |
The table above lists the 25 most used tags, in ascending order of use. Let us see how this looks graphically.
import matplotlib.pyplot as plt
from numpy import arange
%matplotlib inline
fig, ax = plt.subplots()
# Column 0 holds the 25 counts. Using .iloc[0] here would select only the
# first row (a single value), and ax.bar would raise
# "ValueError: incompatible sizes: argument 'height' must be length 25 or scalar"
bar_heights = top_times_tags_used[0].values
bar_positions = arange(25) + 0.25
ax.bar(bar_positions, bar_heights, 0.5)
# Label each bar with its tag name (the DataFrame index)
ax.set_xticks(bar_positions)
ax.set_xticklabels(top_times_tags_used.index, rotation=90)
plt.show()
# The tag names are the DataFrame index; the counts are in column 0
tag = top_times_tags_used.index
times_used = top_times_tags_used[0]
plt.barh(tag, times_used)
plt.title("Times Each Tag was Used")
plt.xlabel("Times Tag Used")
plt.show()
# Tally the questions by their ViewCount value
num_views = {}
for views in questions['ViewCount']:
    if views in num_views:
        num_views[views] += 1
    else:
        num_views[views] = 1
print(num_views)
{2: 1, 3: 3, 4: 14, 5: 29, 6: 49, 7: 60, 8: 82, 9: 107, 10: 111, 11: 120, 12: 124, 13: 166, 14: 165, 15: 157, 16: 153, 17: 162, 18: 169, 19: 168, 20: 168, 21: 164, 22: 165, 23: 149, 24: 132, 25: 149, 26: 136, 27: 138, 28: 145, 29: 139, 30: 142, 31: 111, 32: 124, 33: 102, 34: 123, 35: 97, 36: 108, 37: 107, 38: 71, 39: 101, 40: 72, 41: 98, 42: 95, 43: 84, 44: 64, 45: 56, 46: 66, 47: 46, 48: 64, 49: 48, 50: 52, 51: 67, 52: 54, 53: 50, 54: 60, 55: 60, 56: 55, 57: 43, 58: 48, 59: 42, 60: 38, 61: 44, 62: 36, 63: 49, 64: 30, 65: 33, 66: 31, 67: 34, 68: 20, 69: 39, 70: 31, 71: 32, 72: 38, 73: 30, 74: 26, 75: 29, 76: 24, 77: 29, 78: 24, 79: 14, 80: 22, 81: 25, 82: 19, 83: 24, 84: 26, 85: 15, 86: 19, 87: 22, 88: 21, 89: 14, 90: 20, 91: 18, 92: 26, 93: 24, 94: 17, 95: 15, 96: 16, 97: 9, 98: 16, 99: 24, 100: 19, 101: 19, 102: 7, 103: 14, 104: 19, 105: 17, 106: 16, 107: 10, 108: 15, 109: 14, 110: 15, 111: 6, 112: 13, 113: 12, 114: 16, 115: 8, 116: 12, 117: 14, 118: 11, 119: 14, 120: 11, 121: 3, 122: 7, 123: 18, 124: 15, 125: 10, 126: 9, 127: 12, 128: 12, 129: 14, 130: 12, 131: 10, 132: 10, 133: 11, 134: 12, 135: 10, 136: 11, 137: 6, 138: 8, 139: 9, 140: 11, 141: 11, 142: 9, 143: 9, 144: 9, 145: 5, 146: 10, 147: 12, 148: 10, 149: 9, 150: 7, 151: 7, 152: 10, 153: 4, 154: 11, 155: 7, 156: 6, 157: 12, 158: 8, 159: 12, 160: 7, 161: 10, 162: 10, 163: 12, 164: 11, 165: 2, 166: 7, 167: 9, 168: 5, 169: 6, 170: 9, 171: 14, 172: 6, 173: 7, 174: 4, 2077: 1, 176: 6, 177: 8, 178: 6, 179: 7, 180: 3, 181: 6, 182: 3, 183: 6, 184: 5, 185: 3, 186: 4, 187: 8, 188: 3, 189: 6, 190: 8, 191: 6, 192: 6, 193: 10, 194: 5, 195: 7, 196: 7, 197: 3, 198: 5, 4295: 1, 200: 9, 201: 6, 202: 8, 203: 3, 204: 6, 205: 2, 206: 6, 207: 1, 208: 2, 209: 4, 210: 6, 211: 3, 212: 6, 213: 3, 214: 4, 215: 5, 216: 6, 217: 3, 218: 3, 219: 2, 220: 2, 221: 1, 222: 6, 223: 4, 224: 2, 225: 6, 226: 3, 227: 6, 229: 7, 230: 7, 231: 6, 232: 5, 233: 2, 234: 8, 235: 7, 236: 8, 237: 5, 238: 3, 239: 3, 240: 4, 241: 2, 242: 8, 243: 3, 244: 
5, 245: 2, 246: 4, 247: 5, 248: 3, 249: 6, 250: 5, 251: 1, 252: 3, 253: 4, 254: 1, 255: 8, 256: 4, 257: 2, 258: 4, 259: 1, 260: 5, 261: 3, 262: 4, 263: 2, 264: 7, 265: 4, 266: 7, 267: 3, 268: 4, 269: 2, 270: 2, 271: 1, 272: 3, 273: 2, 274: 7, 275: 3, 276: 1, 277: 4, 278: 2, 279: 4, 2328: 1, 281: 2, 282: 3, 283: 9, 284: 3, 285: 2, 286: 3, 287: 4, 288: 6, 2337: 1, 290: 3, 291: 3, 292: 2, 293: 3, 294: 1, 295: 2, 296: 1, 297: 1, 298: 9, 299: 3, 300: 5, 301: 1, 302: 2, 303: 2, 304: 3, 305: 2, 306: 3, 307: 2, 308: 1, 309: 2, 310: 3, 311: 2, 312: 2, 313: 6, 4410: 1, 315: 1, 316: 2, 317: 4, 319: 2, 2369: 1, 322: 2, 324: 4, 327: 1, 328: 3, 329: 1, 330: 1, 331: 1, 332: 2, 334: 2, 335: 2, 336: 2, 337: 5, 339: 2, 341: 2, 342: 3, 344: 2, 345: 2, 346: 4, 347: 1, 348: 2, 349: 1, 350: 1, 351: 2, 352: 2, 353: 2, 354: 3, 355: 1, 356: 1, 10597: 1, 358: 1, 2407: 1, 360: 5, 361: 2, 362: 1, 363: 2, 364: 3, 365: 2, 366: 3, 367: 4, 368: 3, 370: 3, 373: 3, 375: 1, 376: 5, 377: 1, 378: 3, 379: 2, 380: 2, 381: 1, 382: 2, 384: 2, 490: 1, 386: 2, 387: 2, 388: 1, 389: 4, 390: 2, 392: 3, 393: 5, 394: 1, 395: 1, 396: 1, 397: 2, 398: 2, 399: 2, 400: 1, 401: 1, 402: 1, 403: 1, 405: 2, 406: 2, 408: 2, 409: 1, 410: 5, 412: 2, 4509: 1, 414: 2, 416: 2, 417: 2, 418: 2, 2467: 1, 420: 2, 421: 3, 422: 1, 423: 1, 424: 1, 425: 2, 426: 2, 428: 1, 429: 1, 413: 1, 432: 2, 6577: 1, 434: 1, 33203: 1, 437: 1, 438: 2, 2487: 1, 441: 2, 442: 4, 444: 3, 445: 2, 446: 1, 448: 4, 449: 1, 450: 1, 452: 2, 2501: 1, 454: 3, 4551: 1, 456: 4, 457: 4, 458: 1, 460: 1, 461: 3, 462: 1, 463: 3, 464: 2, 465: 1, 466: 1, 467: 1, 469: 4, 471: 1, 472: 1, 473: 1, 474: 1, 475: 1, 476: 2, 478: 1, 479: 1, 480: 1, 481: 3, 482: 2, 483: 1, 484: 1, 485: 2, 4582: 1, 357: 2, 488: 1, 489: 3, 2538: 1, 491: 3, 492: 4, 493: 1, 494: 1, 495: 1, 496: 2, 497: 2, 502: 1, 503: 1, 505: 1, 506: 1, 507: 1, 508: 1, 509: 3, 510: 1, 511: 1, 512: 2, 513: 3, 514: 1, 517: 3, 518: 2, 520: 1, 521: 2, 522: 1, 524: 2, 527: 1, 530: 1, 531: 1, 532: 1, 533: 1, 534: 1, 
4136: 1, 537: 1, 538: 1, 540: 1, 543: 2, 359: 2, 550: 1, 551: 2, 552: 2, 553: 1, 554: 1, 12847: 1, 560: 1, 561: 1, 2610: 1, 563: 1, 564: 1, 565: 3, 567: 3, 574: 1, 576: 1, 577: 1, 2626: 1, 579: 1, 580: 2, 582: 2, 583: 1, 584: 3, 586: 2, 587: 1, 588: 1, 591: 2, 597: 2, 599: 3, 2648: 1, 603: 1, 607: 2, 609: 1, 610: 1, 612: 1, 613: 1, 616: 1, 625: 1, 627: 1, 628: 1, 629: 2, 632: 1, 4729: 1, 635: 2, 637: 1, 638: 1, 640: 1, 642: 2, 2155: 1, 2692: 1, 4745: 1, 650: 1, 651: 1, 653: 1, 658: 2, 659: 1, 660: 4, 661: 2, 664: 1, 665: 1, 667: 1, 1045: 1, 669: 1, 670: 2, 671: 4, 673: 3, 674: 1, 675: 1, 2724: 1, 677: 1, 679: 3, 682: 2, 683: 1, 689: 2, 691: 1, 692: 2, 693: 1, 694: 2, 2505: 1, 697: 2, 698: 1, 2069: 1, 701: 1, 702: 3, 704: 1, 708: 1, 710: 2, 801: 2, 713: 1, 714: 1, 715: 1, 718: 3, 724: 1, 726: 2, 727: 1, 728: 2, 729: 1, 2778: 1, 738: 2, 742: 1, 747: 2, 748: 2, 749: 2, 751: 1, 752: 2, 3210: 1, 2174: 1, 758: 2, 759: 1, 2412: 1, 6910: 1, 770: 1, 774: 1, 778: 1, 783: 1, 784: 1, 8977: 1, 788: 1, 789: 1, 791: 1, 793: 2, 794: 1, 796: 2, 2849: 1, 802: 2, 805: 2, 807: 1, 2857: 1, 815: 2, 2864: 1, 817: 1, 822: 1, 825: 2, 827: 1, 829: 3, 839: 2, 840: 1, 843: 1, 1165: 1, 848: 2, 849: 1, 850: 2, 851: 1, 4950: 1, 856: 1, 857: 1, 858: 1, 859: 2, 860: 2, 861: 1, 865: 1, 867: 1, 2916: 1, 486: 1, 870: 1, 876: 1, 877: 1, 711: 1, 879: 1, 11122: 1, 883: 1, 888: 2, 890: 1, 891: 1, 893: 2, 11136: 1, 898: 1, 439: 2, 903: 1, 907: 1, 908: 1, 910: 1, 916: 1, 918: 1, 924: 1, 925: 1, 927: 1, 928: 1, 930: 1, 931: 1, 932: 1, 933: 1, 937: 1, 939: 1, 5040: 1, 945: 1, 946: 1, 950: 2, 951: 1, 956: 1, 958: 1, 960: 1, 963: 1, 964: 2, 968: 2, 969: 1, 970: 1, 971: 2, 3021: 1, 984: 1, 987: 1, 989: 1, 991: 1, 993: 1, 1000: 1, 1002: 1, 1003: 1, 1005: 2, 9209: 1, 1019: 1, 1195: 1, 1029: 1, 1036: 1, 1038: 1, 3093: 1, 1047: 1, 1049: 1, 1050: 1, 175: 5, 1053: 1, 1058: 1, 1060: 1, 1061: 1, 1066: 1, 1067: 1, 1885: 1, 1073: 1, 1074: 1, 1075: 1, 1078: 1, 1079: 1, 1081: 1, 1082: 1, 8373: 1, 1089: 1, 1092: 1, 1096: 1, 
1101: 1, 1105: 1, 1112: 1, 1113: 4, 1114: 1, 1115: 1, 1117: 1, 1211: 1, 1128: 1, 1129: 1, 1130: 1, 7278: 1, 1135: 1, 2579: 1, 7284: 1, 1141: 1, 1154: 1, 1155: 1, 1156: 1, 3207: 1, 1162: 1, 1164: 1, 3213: 1, 1169: 1, 1170: 1, 1178: 1, 3229: 1, 1182: 1, 1190: 1, 1191: 1, 199: 5, 1197: 2, 4296: 1, 1210: 1, 2086: 1, 1212: 2, 2250: 1, 1218: 1, 1219: 1, 1225: 1, 1228: 1, 1229: 1, 7374: 1, 547: 1, 1235: 1, 1230: 1, 1239: 1, 1251: 1, 1261: 1, 1269: 1, 1274: 2, 1283: 1, 1286: 1, 1288: 2, 1289: 1, 3341: 1, 1295: 1, 3358: 1, 453: 1, 1324: 1, 1330: 1, 1335: 1, 1338: 1, 2271: 1, 1352: 1, 1354: 1, 1358: 1, 1360: 1, 1362: 1, 1364: 2, 1370: 1, 1374: 1, 1377: 1, 1378: 1, 7523: 1, 1397: 1, 1406: 1, 1407: 1, 3458: 1, 1416: 1, 1418: 1, 3470: 1, 1423: 2, 1428: 1, 3478: 1, 28060: 1, 1444: 1, 730: 2, 1450: 1, 28079: 1, 2292: 1, 3519: 1, 7620: 1, 1477: 1, 2095: 1, 1483: 1, 1487: 1, 5587: 1, 1506: 1, 3556: 1, 1516: 1, 1518: 1, 1541: 1, 1544: 1, 1545: 1, 1557: 1, 3610: 1, 3622: 1, 5672: 1, 1577: 1, 1582: 1, 1585: 1, 3636: 1, 1592: 1, 1598: 1, 1602: 1, 6191: 1, 1615: 1, 1617: 1, 1621: 1, 1625: 1, 6328: 1, 4146: 1, 1651: 1, 1659: 1, 1666: 1, 1676: 1, 1986: 1, 280: 2, 1691: 1, 8474: 1, 5795: 1, 1704: 1, 1708: 1, 1715: 1, 2078: 1, 1752: 1, 3820: 1, 1779: 1, 1787: 1, 5894: 1, 5895: 1, 1801: 1, 3854: 1, 1810: 1, 1813: 1, 1826: 1, 3875: 1, 1828: 1, 1833: 1, 3882: 1, 1842: 1, 649: 1, 2356: 1, 8012: 1, 1871: 1, 1878: 1, 314: 2, 2364: 1, 1900: 1, 2450: 1, 1907: 1, 1914: 1, 1916: 1, 3966: 1, 1919: 2, 3970: 1, 1935: 1, 3991: 1, 1960: 1, 1966: 1, 1969: 1, 1977: 1, 1979: 1, 816: 2, 695: 1, 1987: 1, 4045: 1, 2000: 1, 2010: 1, 2044: 1}
no_of_views = pd.DataFrame.from_dict(num_views, orient='index')
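# Caution: each value is the number of questions with that ViewCount,
# not a per-tag view total, despite the column label
no_of_views.rename(columns={0: "Times Tag Viewed"}, inplace=True)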
print(no_of_views)
Times Tag Viewed 2 1 3 3 4 14 5 29 6 49 7 60 8 82 9 107 10 111 11 120 12 124 13 166 14 165 15 157 16 153 17 162 18 169 19 168 20 168 21 164 22 165 23 149 24 132 25 149 26 136 27 138 28 145 29 139 30 142 31 111 ... ... 1842 1 649 1 2356 1 8012 1 1871 1 1878 1 314 2 2364 1 1900 1 2450 1 1907 1 1914 1 1916 1 3966 1 1919 2 3970 1 1935 1 3991 1 1960 1 1966 1 1969 1 1977 1 1979 1 816 2 695 1 1987 1 4045 1 2000 1 2010 1 2044 1 [912 rows x 1 columns]
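The table above tallies how many questions share each ViewCount value, so its index is a view count rather than a tag. If the aim is to see how often each tag was viewed, one possible approach is to add every question's ViewCount to each of its tags. The sketch below is an assumption about that next step, not the notebook's own method; `sample` holds rows 0, 2 and 3 from the `questions.head()` output.

```python
from collections import defaultdict

# (ViewCount, Tags) pairs copied from rows 0, 2 and 3 of questions.head()
sample = [
    (21, ["machine-learning", "data-mining"]),
    (1651, ["python", "time-series", "forecast", "forecasting"]),
    (55, ["machine-learning", "scikit-learn", "pca"]),
]

# Credit each question's views to every tag attached to it
views_per_tag = defaultdict(int)
for view_count, tags in sample:
    for tag in tags:
        views_per_tag[tag] += view_count

print(views_per_tag["machine-learning"])  # 76
```

On the full data, the same loop could presumably run over `zip(questions['ViewCount'], questions['Tags'])`.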