{"pages":[{"title":"About Guillaume Redoulès","text":"I am a data scientist. I founded the TechStars company New Knowledge and am also the co-host of the data science podcast, Partially Derivative . Previously, I led Ushahidi's work on crisis and humanitarian data and launched CrisisNET . Prior to Ushahidi, I was Director of the Governance Project at FrontlineSMS . I earned a Ph.D. in Political Science from the University of California, Davis researching the quantitative impact of civil wars on health care systems. In 2008, I founded Conflict Health , a blog on the defense of health and health workers in armed conflict and political violence. I also wrote for the Daily Dot , United States Naval Institute Blog , TheAtlantic.com, ForeignPolicy.com, UN Dispatch , and elsewhere. I earned a B.A. from the University of Miami, where I triple majored in political science, international studies, and religious studies. Email: cralbon@gmail.com Twitter: @chrisalbon Curriculum Vitae Education Ph.D., Political Science , University of California, Davis. 2012 Dissertation: \"Civil Wars And Health Systems\", a quantitative analysis of the determinants of rebel and government behavior towards health system destruction and reconstruction using original data Fields: International Relations, Quantitative Methodology, and Epidemiology M.A., Political Science , University of California, Davis. 2010 Thesis: \"U.N. Peace Operations And Public Health After Civil War\" B.A. , University of Miami, Miami, FL. 2006 Triple majored in political science, international studies, and religious studies Experience Co-founder & Chief Science Officer , New Knowledge , 2015 - Present In charge of everything data science and product. Co-founder & Co-host , Partially Derivative , 2014 - Present Co-founded a podcast on data and data science. Volunteer Data Scientist , DataKind , 2015 - Present Director of CrisisNET , Ushahidi , 2014 - 2015 Launched a pipeline for global humanitarian crisis data. 
Director of Data Projects , Ushahidi , 2013 - 2014 The non-profit's first data science hire; led all data science efforts. Project Director , FrontlineSMS , 2012 - 2013 Led FrontlineSMS's Governance Project, an effort to improve the transparency and accountability of governments through mobile technology Contributor , Daily Dot , 2012 - 2013 Wrote opinion pieces on the politics of data and the internet Contributor , United Nations Dispatch , 2011 - 2013 Wrote news, opinion, and analysis on global affairs, particularly relating to health during conflict, global health politics, and the role of social media Blogger , Conflict Health , 2008 - 2012 Designed and launched blog on defending health and health workers against persecution, violence, and armed conflict Wrote 485 posts over four years Cited by major publications including The Atlantic, Harpers, Wired, The Economist, Time, The Guardian, and The American Prospect Contributor , United States Naval Institute Blog , 2009 - 2011 Wrote posts on the U.S. Navy's role in disaster relief, humanitarian assistance, and health diplomacy for one of America's most prestigious professional military associations Research Assistant , U.C. Davis Department Of Political Science , 2008 - 2009 Researched the effect of U.S. defense policy on military suicides Founder , Serve Your World 2002 - 2006 An information site on overseas volunteering","tags":"pages","url":"redoules.github.io/pages/about.html","loc":"redoules.github.io/pages/about.html"},{"title":"Saving a matplotlib figure with a high resolution","text":"Creating a matplotlib figure # Importing matplotlib % matplotlib inline import matplotlib.pyplot as plt import numpy as np Drawing a figure # Fixing random state for reproducibility np . random . seed ( 19680801 ) mu , sigma = 100 , 15 x = mu + sigma * np . random . randn ( 10000 ) # the histogram of the data (density=True replaces the removed normed argument) n , bins , patches = plt . hist ( x , 50 , density = True , facecolor = 'g' , alpha = 0.75 ) plt . xlabel ( 'Smarts' ) plt . 
ylabel ( 'Probability' ) plt . title ( 'Histogram of IQ' ) plt . text ( 60 , . 025 , r '$\\mu=100,\\ \\sigma=15$' ) plt . axis ([ 40 , 160 , 0 , 0.03 ]) plt . grid ( True ) plt . show () Saving the figure Normally, one would use the following code: plt . savefig ( 'filename.png' ) The figure is then exported to the file \"filename.png\" at the default resolution. In addition, you can set the dpi argument to a scalar value to control the resolution, for example: plt . savefig ( 'filename_hi_dpi.png' , dpi = 300 ) ","tags":"Python","url":"redoules.github.io/python/Saving_a_matplotlib_figure_with_a_high_resolution.html","loc":"redoules.github.io/python/Saving_a_matplotlib_figure_with_a_high_resolution.html"},{"title":"Iterating over a DataFrame","text":"Create a sample dataframe # Import modules import pandas as pd # Example dataframe raw_data = { 'fruit' : [ 'Banana' , 'Orange' , 'Apple' , 'lemon' , \"lime\" , \"plum\" ], 'color' : [ 'yellow' , 'orange' , 'red' , 'yellow' , \"green\" , \"purple\" ], 'kcal' : [ 89 , 47 , 52 , 15 , 30 , 28 ] } df = pd . DataFrame ( raw_data , columns = [ 'fruit' , 'color' , 'kcal' ]) df fruit color kcal 0 Banana yellow 89 1 Orange orange 47 2 Apple red 52 3 lemon yellow 15 4 lime green 30 5 plum purple 28 Using the iterrows method A pandas DataFrame can return a generator of (index, row) pairs with the iterrows method. It can then be used to loop over the rows of the DataFrame: for index , row in df . iterrows (): print ( \"At line {0} there is a {1} which is {2} and contains {3} kcal\" . 
format ( index , row [ \"fruit\" ], row [ \"color\" ], row [ \"kcal\" ])) At line 0 there is a Banana which is yellow and contains 89 kcal At line 1 there is a Orange which is orange and contains 47 kcal At line 2 there is a Apple which is red and contains 52 kcal At line 3 there is a lemon which is yellow and contains 15 kcal At line 4 there is a lime which is green and contains 30 kcal At line 5 there is a plum which is purple and contains 28 kcal","tags":"Python","url":"redoules.github.io/python/Iterating_over_a_dataframe.html","loc":"redoules.github.io/python/Iterating_over_a_dataframe.html"},{"title":"Article Recommander","text":"import pandas as pd import numpy as np % matplotlib inline Loading data and preprocessing We first load the pickled article database. We then clean it and separate the interesting articles from the uninteresting ones: an article is labelled 1 if its note is positive and 0 otherwise. df = pd . read_pickle ( './article.pkl' ) del df [ \"html\" ] del df [ \"image\" ] del df [ \"URL\" ] del df [ \"hash\" ] del df [ \"source\" ] df [ \"label\" ] = df [ \"note\" ] . apply ( lambda x : 0 if x <= 0 else 1 ) df . head ( 5 ) authors note resume texte titre label 0 [Danny Bradbury, Marco Santori, Adam Draper, M... -10.0 Black Market Reloaded, a black market site tha... Black Market Reloaded, a black market site tha... Black Market Reloaded back online after source... 0 1 [Emily Spaven, Stan Higgins, Emilyspaven] 1.0 The UK Home Office believes the government sho... The UK Home Office believes the government sho... Home Office: UK Should Create a Crime-Fighting... 1 2 [Pete Rizzo, Alex Batlin, Yessi Bello Perez, P... -10.0 Though lofty in its ideals, lead developer Dan... A new social messaging app is aiming to disrup... Gems Bitcoin App Lets Users Earn Money From So... 0 3 [Nermin Hajdarbegovic, Stan Higgins, Pete Rizz... 
3.0 US satellite service provider DISH Network has... US satellite service provider DISH Network has... DISH Becomes World's Largest Company to Accept... 1 4 [Stan Higgins, Bailey Reutzel, Garrett Keirns,... -10.0 An unidentified 28-year-old man was robbed of ... An unidentified 28-year-old man was robbed of ... Bitcoin Stolen at Gunpoint in New York City Ro... 0 Basic statistics on the dataset Let's explore the dataset and extract some numbers: * the number of articles liked/disliked df [ \"label\" ] . value_counts () 0 879 1 324 Name: label, dtype: int64 Create the full content column df [ 'full_content' ] = df . titre + ' ' + df . resume # exclude the full text of the article (the texte column) for the moment df . head ( 1 ) authors note resume texte titre label full_content 0 [Danny Bradbury, Marco Santori, Adam Draper, M... -10.0 Black Market Reloaded, a black market site tha... Black Market Reloaded, a black market site tha... Black Market Reloaded back online after source... 0 Black Market Reloaded back online after source... from sklearn.model_selection import train_test_split training , testing = train_test_split ( df , # The dataset we want to split train_size = 0.75 , # The proportional size of our training set stratify = df . label , # The labels are used for stratification random_state = 400 # Use the same random state for reproducibility ) training . head ( 5 ) authors note resume texte titre label full_content 748 [Jon Brodkin] -10.0 Amazon, Reddit, Mozilla, and other Internet co... Amazon, Reddit, Mozilla, and other Internet co... Amazon and Reddit try to save net neutrality r... 0 Amazon and Reddit try to save net neutrality r... 
1183 [Jon Brodkin] -10.0 (The Time Warner involved in this transaction ... A group of mostly Democratic senators led by A... Democrats urge Trump administration to block A... 0 Democrats urge Trump administration to block A... 769 [Joseph Brogan] -10.0 On Twitter, bad news comes at all hours, with ... On Twitter, bad news comes at all hours, with ... Some of the best art on Twitter comes from the... 0 Some of the best art on Twitter comes from the... 57 [Michael Del Castillo, Pete Rizzo, Trond Vidar... -10.0 Publicly traded online travel service Webjet i... Publicly traded online travel service Webjet i... Webjet Ethereum Pilot Targets Hotel Industry's... 0 Webjet Ethereum Pilot Targets Hotel Industry's... 892 [Andrew Cunningham] 10.0 What has changed on the 2017 MacBook, then?\\nI... Andrew Cunningham\\n\\nAndrew Cunningham\\n\\nAndr... Mini-review: The 2017 MacBook could actually b... 1 Mini-review: The 2017 MacBook could actually b... from sklearn.feature_extraction.text import TfidfVectorizer , CountVectorizer from sklearn.svm import LinearSVC , SVC from sklearn.pipeline import Pipeline from sklearn.model_selection import cross_val_predict from utils.plotting import pipeline_performance steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , LinearSVC ()) ) pipeline = Pipeline ( steps ) predicted_labels = cross_val_predict ( pipeline , training . full_content , training . label ) pipeline_performance ( training . label , predicted_labels ) pipeline = pipeline . fit ( training . titre , training . label ) Accuracy = 80.6% Confusion matrix, without normalization [[624 35] [140 103]] import re from utils.plotting import print_top_features from sklearn.model_selection import GridSearchCV def mask_integers ( s ): return re . 
sub ( r '\\d+' , 'INTMASK' , s ) steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , LinearSVC ()) ) pipeline = Pipeline ( steps ) gs_params = { #'vectorizer__use_idf': (True, False), 'vectorizer__lowercase' : [ True , False ], 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__C' : np . linspace ( 5 , 20 , 25 ) } gs = GridSearchCV ( pipeline , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline1 = gs . best_estimator_ predicted_labels = pipeline1 . predict ( testing . full_content ) pipeline_performance ( testing . label , predicted_labels ) print_top_features ( pipeline1 , n_features = 10 ) aaa = gs . predict ( testing . full_content ) == testing . label aaa = aaa [ testing . label == 1 ] testing [ \"titre\" ] . iloc [ ~ aaa . values ] #pipeline1.predict([\"windows xbox bitcoin\"]) from sklearn.externals import joblib joblib . dump ( pipeline1 , 'classifier.pkl' ) gs . predict ([ 'Google' ]) array([1], dtype=int64) steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , SVC ()) ) pipeline = Pipeline ( steps ) gs_params = { #'vectorizer__use_idf': (True, False), 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__C' : np . linspace ( 5 , 20 , 25 ) } gs = GridSearchCV ( pipeline , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline1 = gs . best_estimator_ predicted_labels = pipeline1 . predict ( testing . full_content ) pipeline_performance ( testing . 
label , predicted_labels ) print_top_features ( pipeline1 , n_features = 10 ) {'classifier__C': 5.0, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': , 'vectorizer__stop_words': 'english'} 0.711180124224 Accuracy = 71.2% Confusion matrix, without normalization [[153 0] [ 62 0]] --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 25 pipeline_performance(testing.label, predicted_labels) 26 ---> 27 print_top_features(pipeline1, n_features=10) C:\\Users\\Guillaume\\Documents\\Code\\recommandation\\utils\\plotting.py in print_top_features(pipeline, vectorizer_name, classifier_name, n_features) 81 def print_top_features(pipeline, vectorizer_name='vectorizer', classifier_name='classifier', n_features=7): 82 vocabulary = np.array(pipeline.named_steps[vectorizer_name].get_feature_names()) ---> 83 coefs = pipeline.named_steps[classifier_name].coef_[0] 84 top_feature_idx = np.argsort(coefs) 85 top_features = vocabulary[top_feature_idx] C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\svm\\base.py in coef_(self) 483 def coef_(self): 484 if self.kernel != 'linear': --> 485 raise ValueError('coef_ is only available when using a ' 486 'linear kernel') 487 ValueError: coef_ is only available when using a linear kernel from sklearn.naive_bayes import BernoulliNB steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , BernoulliNB ()) ) pipeline2 = Pipeline ( steps ) gs_params = { 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__alpha' : np . linspace ( 0 , 1 , 5 ), 'classifier__fit_prior' : [ True , False ] } gs = GridSearchCV ( pipeline2 , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline2 = gs . best_estimator_ predicted_labels = pipeline2 . 
predict ( testing . full_content ) pipeline_performance ( testing . label , predicted_labels ) print_top_features ( pipeline2 , n_features = 10 ) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:820: RuntimeWarning: divide by zero encountered in log neg_prob = np.log(1 - np.exp(self.feature_log_prob_)) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:823: RuntimeWarning: invalid value encountered in add jll += self.class_log_prior_ + neg_prob.sum(axis=1) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide 
by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:801: RuntimeWarning: divide by zero encountered 
in log self.feature_log_prob_ = (np.log(smoothed_fc) - {'classifier__alpha': 0.25, 'classifier__fit_prior': True, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': , 'vectorizer__stop_words': 'english'} 0.805900621118 Accuracy = 78.1% Confusion matrix, without normalization [[140 13] [ 34 28]] Top like features: ['use' 'just' 'year' 'price' 'time' 'Bitcoin' 'bitcoin' 'new' 'The' 'INTMASK'] --- Top dislike features: ['ABBA' 'cable' 'cab' 'byte' 'publication' 'bye' 'publications' 'publicity' 'buyer' 'publicizing'] from sklearn.naive_bayes import MultinomialNB steps = ( ( 'vectorizer' , TfidfVectorizer ()), ( 'classifier' , MultinomialNB ()) ) pipeline3 = Pipeline ( steps ) gs_params = { 'vectorizer__stop_words' : [ 'english' , None ], 'vectorizer__ngram_range' : [( 1 , 1 ), ( 1 , 2 ), ( 2 , 2 )], 'vectorizer__preprocessor' : [ mask_integers , None ], 'classifier__alpha' : np . linspace ( 0 , 1 , 5 ), 'classifier__fit_prior' : [ True , False ] } gs = GridSearchCV ( pipeline3 , gs_params , n_jobs = 1 ) gs . fit ( training . full_content , training . label ) print ( gs . best_params_ ) print ( gs . best_score_ ) pipeline3 = gs . best_estimator_ predicted_labels = pipeline3 . predict ( testing . full_content ) pipeline_performance ( testing . 
label , predicted_labels ) print_top_features ( pipeline3 , n_features = 10 ) C:\\Users\\Guillaume\\Anaconda3\\lib\\site-packages\\sklearn\\naive_bayes.py:699: RuntimeWarning: divide by zero encountered in log self.feature_log_prob_ = (np.log(smoothed_fc) - {'classifier__alpha': 0.5, 'classifier__fit_prior': False, 'vectorizer__ngram_range': (1, 1), 'vectorizer__preprocessor': , 'vectorizer__stop_words': 'english'} 0.80900621118 Accuracy = 79.1% Confusion matrix, without normalization [[141 12] [ 33 29]] Top like features: ['time' 'Google' 'Pro' 'Apple' 'new' 'The' 'Bitcoin' 'price' 'bitcoin' 'INTMASK'] --- Top dislike features: ['ABBA' 'categories' 'catching' 'catalyst' 'catalog' 'casually' 'casts' 'cast' 'cashier' 'ran']","tags":"Machine Learning","url":"redoules.github.io/machine-learning/Source code for the recommandation engine for articles.html","loc":"redoules.github.io/machine-learning/Source code for the recommandation engine for articles.html"},{"title":"Bash template article","text":"","tags":"Bash","url":"redoules.github.io/bash/sample article.html","loc":"redoules.github.io/bash/sample
article.html"}]}