Question edit history

2

Markdown changes

2018/05/24 23:43

Posted

bouyomisan

Score 87

test CHANGED
File without changes
test CHANGED
@@ -10,7 +10,7 @@
 
 
 
- '''
+ ```
 
  from sklearn.pipeline import Pipeline
 
@@ -76,4 +76,4 @@
 
 
 
- '''
+ ```

1

sdfs

2018/05/24 23:43

Posted

bouyomisan

Score 87

test CHANGED
File without changes
test CHANGED
@@ -3,3 +3,77 @@
 
 
  I could not understand how the out-of-core implementation works. Could you explain why training finishes so quickly with it?
+
+
+
+
+
+
+
+ '''
+
+ from sklearn.pipeline import Pipeline
+
+ from sklearn.linear_model import LogisticRegression
+
+ from sklearn.feature_extraction.text import TfidfVectorizer
+
+ from sklearn.model_selection import GridSearchCV
+
+
+
+ tfidf = TfidfVectorizer(strip_accents=None,
+
+                         lowercase=False,
+
+                         preprocessor=None)
+
+
+
+ param_grid = [{'vect__ngram_range': [(1, 1)],
+
+                'vect__stop_words': [stop, None],
+
+                'vect__tokenizer': [tokenizer, tokenizer_porter],
+
+                'clf__penalty': ['l1', 'l2'],
+
+                'clf__C': [1.0, 10.0, 100.0]},
+
+               {'vect__ngram_range': [(1, 1)],
+
+                'vect__stop_words': [stop, None],
+
+                'vect__tokenizer': [tokenizer, tokenizer_porter],
+
+                'vect__use_idf':[False],
+
+                'vect__norm':[None],
+
+                'clf__penalty': ['l1', 'l2'],
+
+                'clf__C': [1.0, 10.0, 100.0]},
+
+               ]
+
+
+
+ lr_tfidf = Pipeline([('vect', tfidf),
+
+                      ('clf', LogisticRegression(random_state=0))])
+
+
+
+ gs_lr_tfidf = GridSearchCV(lr_tfidf, param_grid,
+
+                            scoring='accuracy',
+
+                            cv=5,
+
+                            verbose=1,
+
+                            n_jobs=-1)
+
+
+
+ '''
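
As a rough illustration of why the grid search in this post is expensive to run, the sketch below counts how many model fits `GridSearchCV` would perform for the `param_grid` above with `cv=5`. The string placeholders stand in for `stop`, `tokenizer`, and `tokenizer_porter`, which the post does not define, and `count_candidates` is a hypothetical helper written for this example, not part of scikit-learn.

```python
# Placeholders: the post does not define these names,
# so plain strings stand in for them (assumption for illustration only).
stop = "stop"
tokenizer = "tokenizer"
tokenizer_porter = "tokenizer_porter"

# Same grid as in the post.
param_grid = [{'vect__ngram_range': [(1, 1)],
               'vect__stop_words': [stop, None],
               'vect__tokenizer': [tokenizer, tokenizer_porter],
               'clf__penalty': ['l1', 'l2'],
               'clf__C': [1.0, 10.0, 100.0]},
              {'vect__ngram_range': [(1, 1)],
               'vect__stop_words': [stop, None],
               'vect__tokenizer': [tokenizer, tokenizer_porter],
               'vect__use_idf': [False],
               'vect__norm': [None],
               'clf__penalty': ['l1', 'l2'],
               'clf__C': [1.0, 10.0, 100.0]}]


def count_candidates(grid):
    """Count parameter combinations across a list of sub-grids."""
    total = 0
    for sub_grid in grid:
        combos = 1
        for values in sub_grid.values():
            combos *= len(values)  # every value list multiplies the combinations
        total += combos
    return total


cv = 5  # number of cross-validation folds, as in the post
n_candidates = count_candidates(param_grid)
n_fits = n_candidates * cv  # one fit per candidate per fold

print(n_candidates, n_fits)  # 48 240
```

So this search trains 240 pipelines on the training folds, plus one final refit on the full training set if `refit` is left at its default. An out-of-core approach typically makes a single pass over the data in mini-batches, which is presumably a large part of the speed difference the question asks about.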