{"pages":[{"title":"About Guillaume Redoulès","text":"I am a data scientist and a mechanical engineer working on numerical methods for stress computations in the field of rocket propulsion. Prior to that, I earned an MSc in Computational Fluid Dynamics and aerodynamics from Imperial College London. Email: guillaume.redoules@gadz.org LinkedIn: Guillaume Redoulès Curriculum Vitae Experience Thermomechanical method and tools engineer , Ariane Group , 2015 - Present In charge of tools and methods related to thermomechanical computations. Focal point for machine learning. Education MSc Advanced Computational Methods for Aeronautics, Flow Management and Fluid-Structure Interaction , Imperial College London, London. 2013 Dissertation: \"Estimator design for fluid flows\" Fields: Aeronautics, aerodynamics, computational fluid dynamics, numerical methods Arts et Métiers Paristech , France, 2011 Generalist engineering degree Fields: Mechanics, electrical engineering, casting, machining, project management, finance, IT, etc.","tags":"pages","url":"redoules.github.io/pages/about.html","loc":"redoules.github.io/pages/about.html"},{"title":"Saving a matplotlib figure with a high resolution","text":"Creating a matplotlib figure #Importing matplotlib % matplotlib inline import matplotlib.pyplot as plt import numpy as np Drawing a figure # Fixing random state for reproducibility np . random . seed ( 19680801 ) mu , sigma = 100 , 15 x = mu + sigma * np . random . randn ( 10000 ) # the histogram of the data n , bins , patches = plt . hist ( x , 50 , density = True , facecolor = 'g' , alpha = 0.75 ) plt . xlabel ( 'Smarts' ) plt . ylabel ( 'Probability' ) plt . title ( 'Histogram of IQ' ) plt . text ( 60 , . 025 , r '$\\mu=100,\\ \\sigma=15$' ) plt . axis ([ 40 , 160 , 0 , 0.03 ]) plt . grid ( True ) plt . show () Saving the figure Normally, one would use the following code: plt . savefig ( 'filename.png' ) The figure is then exported to the file \"filename.png\" at the default resolution. 
In addition, you can set the dpi argument to a scalar value to control the resolution, for example: plt . savefig ( 'filename_hi_dpi.png' , dpi = 300 ) ","tags":"Python","url":"redoules.github.io/python/Saving_a_matplotlib_figure_with_a_high_resolution.html","loc":"redoules.github.io/python/Saving_a_matplotlib_figure_with_a_high_resolution.html"},{"title":"Iterating over a DataFrame","text":"Create a sample dataframe # Import modules import pandas as pd # Example dataframe raw_data = { 'fruit' : [ 'Banana' , 'Orange' , 'Apple' , 'lemon' , \"lime\" , \"plum\" ], 'color' : [ 'yellow' , 'orange' , 'red' , 'yellow' , \"green\" , \"purple\" ], 'kcal' : [ 89 , 47 , 52 , 15 , 30 , 28 ] } df = pd . DataFrame ( raw_data , columns = [ 'fruit' , 'color' , 'kcal' ]) df fruit color kcal 0 Banana yellow 89 1 Orange orange 47 2 Apple red 52 3 lemon yellow 15 4 lime green 30 5 plum purple 28 Using the iterrows method The iterrows method of a pandas DataFrame returns a generator that yields an (index, row) pair for each row. It can be used to loop over the rows of the DataFrame: for index , row in df . iterrows (): print ( \"At line {0} there is a {1} which is {2} and contains {3} kcal\" . 
format ( index , row [ \"fruit\" ], row [ \"color\" ], row [ \"kcal\" ])) At line 0 there is a Banana which is yellow and contains 89 kcal At line 1 there is a Orange which is orange and contains 47 kcal At line 2 there is a Apple which is red and contains 52 kcal At line 3 there is a lemon which is yellow and contains 15 kcal At line 4 there is a lime which is green and contains 30 kcal At line 5 there is a plum which is purple and contains 28 kcal","tags":"Python","url":"redoules.github.io/python/Iterating_over_a_dataframe.html","loc":"redoules.github.io/python/Iterating_over_a_dataframe.html"},{"title":"Article Recommander","text":"import pandas as pd import numpy as np % matplotlib inline Loading data and preprocessing We first load the pickled article database, then clean it and separate the interesting articles from the uninteresting ones. df = pd . read_pickle ( './article.pkl' ) del df [ \"html\" ] del df [ \"image\" ] del df [ \"URL\" ] del df [ \"hash\" ] del df [ \"source\" ] df [ \"label\" ] = df [ \"note\" ] . apply ( lambda x : 0 if x <= 0 else 1 ) df . head ( 5 ) authors note resume texte titre label 0 [Danny Bradbury, Marco Santori, Adam Draper, M... -10.0 Black Market Reloaded, a black market site tha... Black Market Reloaded, a black market site tha... Black Market Reloaded back online after source... 0 1 [Emily Spaven, Stan Higgins, Emilyspaven] 1.0 The UK Home Office believes the government sho... The UK Home Office believes the government sho... Home Office: UK Should Create a Crime-Fighting... 1 2 [Pete Rizzo, Alex Batlin, Yessi Bello Perez, P... -10.0 Though lofty in its ideals, lead developer Dan... A new social messaging app is aiming to disrup... Gems Bitcoin App Lets Users Earn Money From So... 0 3 [Nermin Hajdarbegovic, Stan Higgins, Pete Rizz... 
3.0 US satellite service provider DISH Network has... US satellite service provider DISH Network has... DISH Becomes World's Largest Company to Accept... 1 4 [Stan Higgins, Bailey Reutzel, Garrett Keirns,... -10.0 An unidentified 28-year-old man was robbed of ... An unidentified 28-year-old man was robbed of ... Bitcoin Stolen at Gunpoint in New York City Ro... 0 Basic statistics on the dataset Let's explore the dataset and extract some numbers: * the number of articles liked/disliked df [ \"label\" ] . value_counts () 0 879 1 324 Name: label, dtype: int64 Create the full content column df [ 'full_content' ] = df . titre + ' ' + df . resume #exclude the full text of the article for the moment df . head ( 1 ) authors note resume texte titre label full_content 0 [Danny Bradbury, Marco Santori, Adam Draper, M... -10.0 Black Market Reloaded, a black market site tha... Black Market Reloaded, a black market site tha... Black Market Reloaded back online after source... 0 Black Market Reloaded back online after source... from sklearn.model_selection import train_test_split training , testing = train_test_split ( df , # The dataset we want to split train_size = 0.75 , # The proportional size of our training set stratify = df . label , # The labels are used for stratification random_state = 400 # Use the same random state for reproducibility ) training . head ( 5 ) authors note resume texte titre label full_content 748 [Jon Brodkin] -10.0 Amazon, Reddit, Mozilla, and other Internet co... Amazon, Reddit, Mozilla, and other Internet co... Amazon and Reddit try to save net neutrality r... 0 Amazon and Reddit try to save net neutrality r... 
1183 [Jon Brodkin] -10.0 (The Time Warner involved in this transaction ... A group of mostly Democratic senators led by A... Democrats urge Trump administration to block A... 0 Democrats urge Trump administration to block A... 769 [Joseph Brogan] -10.0 On Twitter, bad news comes at all hours, with ... On Twitter, bad news comes at all hours, with ... Some of the best art on Twitter comes from the... 0 Some of the best art on Twitter comes from the... 57 [Michael Del Castillo, Pete Rizzo, Trond Vidar... -10.0 Publicly traded online travel service Webjet i... Publicly traded online travel service Webjet i... Webjet Ethereum Pilot Targets Hotel Industry's... 0 Webjet Ethereum Pilot Targets Hotel Industry's... 892 [Andrew Cunningham] 10.0 What has changed on the 2017 MacBook, then?\\nI... Andrew Cunningham\\n\\nAndrew Cunningham\\n\\nAndr... Mini-review: The 2017 MacBook could actually b... 1 Mini-review: The 2017 MacBook could actually b... from sklearn.feature_extraction.text import TfidfVectorizer , CountVectorizer from sklearn.svm import LinearSVC , SVC from sklearn.pipeline import Pipeline from sklearn.model_selection import cross_val_predict from utils.plotting import pipeline_performance steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , LinearSVC ()) ) pipeline = Pipeline ( steps ) predicted_labels = cross_val_predict ( pipeline , training . full_content , training . label ) pipeline_performance ( training . label , predicted_labels ) pipeline = pipeline . fit ( training . titre , training . label ) Accuracy = 80.6% Confusion matrix, without normalization [[624 35] [140 103]] import re from utils.plotting import print_top_features from sklearn.model_selection import GridSearchCV def mask_integers ( s ): return re . 
sub ( r '\\d+' , 'INTMASK' , s ) steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , LinearSVC ()) ) pipeline = Pipeline ( steps ) gs_params = { #'vectorizer__use_idf': (True, False), 'vectorizer__lowercase' : [ True , False ], 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__C' : np . linspace ( 5 , 20 , 25 ) } gs = GridSearchCV ( pipeline , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline1 = gs . best_estimator_ predicted_labels = pipeline1 . predict ( testing . full_content ) pipeline_performance ( testing . label , predicted_labels ) print_top_features ( pipeline1 , n_features = 10 ) aaa = gs . predict ( testing . full_content ) == testing . label aaa = aaa [ testing . label == 1 ] testing [ \"titre\" ] . iloc [ ~ aaa . values ] #pipeline1.predict([\"windows xbox bitcoin\"]) from sklearn.externals import joblib joblib . dump ( pipeline1 , 'classifier.pkl' ) gs . predict ([ 'Google' ]) array([1], dtype=int64) steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , SVC ()) ) pipeline = Pipeline ( steps ) gs_params = { #'vectorizer__use_idf': (True, False), 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__C' : np . linspace ( 5 , 20 , 25 ) } gs = GridSearchCV ( pipeline , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline1 = gs . best_estimator_ predicted_labels = pipeline1 . predict ( testing . full_content ) pipeline_performance ( testing . 
label , predicted_labels ) print_top_features ( pipeline1 , n_features = 10 ) {'classifier__C': 5.0, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': , 'vectorizer__stop_words': 'english'} 0.711180124224 Accuracy = 71.2% Confusion matrix, without normalization [[153 0] [ 62 0]] --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 25 pipeline_performance(testing.label, predicted_labels) 26 ---> 27 print_top_features(pipeline1, n_features=10) C:\\Users\\Guillaume\\Documents\\Code\\recommandation\\utils\\plotting.py in print_top_features(pipeline, vectorizer_name, classifier_name, n_features) 81 def print_top_features(pipeline, vectorizer_name='vectorizer', classifier_name='classifier', n_features=7): 82 vocabulary = np.array(pipeline.named_steps[vectorizer_name].get_feature_names()) ---> 83 coefs = pipeline.named_steps[classifier_name].coef_[0] 84 top_feature_idx = np.argsort(coefs) 85 top_features = vocabulary[top_feature_idx] C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\svm\\base.py in coef_(self) 483 def coef_(self): 484 if self.kernel != 'linear': --> 485 raise ValueError('coef_ is only available when using a ' 486 'linear kernel') 487 ValueError: coef_ is only available when using a linear kernel from sklearn.naive_bayes import BernoulliNB steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , BernoulliNB ()) ) pipeline2 = Pipeline ( steps ) gs_params = { 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__alpha' : np . linspace ( 0 , 1 , 5 ), 'classifier__fit_prior' : [ True , False ] } gs = GridSearchCV ( pipeline2 , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline2 = gs . best_estimator_ predicted_labels = pipeline2 . 
predict ( testing . full_content ) pipeline_performance ( testing . label , predicted_labels ) print_top_features ( pipeline2 , n_features = 10 ) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:820: RuntimeWarning: divide by zero encountered in log neg_prob = np.log(1 - np.exp(self.feature_log_prob_)) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:823: RuntimeWarning: invalid value encountered in add jll += self.class_log_prior_ + neg_prob.sum(axis=1) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered 
in log self.feature_log_prob_ = (np.log(smoothed_fc) - {'classifier__alpha': 0.25, 'classifier__fit_prior': True, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': , 'vectorizer__stop_words': 'english'} 0.805900621118 Accuracy = 78.1% Confusion matrix, without normalization [[140 13] [ 34 28]] Top like features: ['use' 'just' 'year' 'price' 'time' 'Bitcoin' 'bitcoin' 'new' 'The' 'INTMASK'] --- Top dislike features: ['ABBA' 'cable' 'cab' 'byte' 'publication' 'bye' 'publications' 'publicity' 'buyer' 'publicizing'] from sklearn.naive_bayes import MultinomialNB steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , MultinomialNB ()) ) pipeline3 = Pipeline ( steps ) gs_params = { 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__alpha' : np . linspace ( 0 , 1 , 5 ), 'classifier__fit_prior' : [ True , False ] } gs = GridSearchCV ( pipeline3 , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline3 = gs . best_estimator_ predicted_labels = pipeline3 . predict ( testing . full_content ) pipeline_performance ( testing . 
label , predicted_labels ) print_top_features ( pipeline3 , n_features = 10 ) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:699: RuntimeWarning: divide by zero encountered in log 
self.feature_log_prob_ = (np.log(smoothed_fc) - {'classifier__alpha': 0.5, 'classifier__fit_prior': False, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': , 'vectorizer__stop_words': 'english'} 0.80900621118 Accuracy = 79.1% Confusion matrix, without normalization [[141 12] [ 33 29]] Top like features: ['time' 'Google' 'Pro' 'Apple' 'new' 'The' 'Bitcoin' 'price' 'bitcoin' 'INTMASK'] --- Top dislike features: ['ABBA' 'categories' 'catching' 'catalyst' 'catalog' 'casually' 'casts' 'cast' 'cashier' 'ran']","tags":"Machine Learning","url":"redoules.github.io/machine-learning/Source code for the recommandation engine for articles.html","loc":"redoules.github.io/machine-learning/Source code for the recommandation engine for articles.html"}]}