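# pip install bs4
from bs4 import BeautifulSoup  # a handy tool for Python web scraping
"""
Beautiful Soup is a Python library for pulling data out of HTML and XML files.
It works with your favorite parser to provide idiomatic ways of navigating,
searching, and modifying the parse tree. It commonly saves programmers hours or days of work.
"""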
import requests
blog_url = 'https://blog.51cto.com/13118411/2154806'
data = requests.get(blog_url)
print(data)
print(data.text)
<Response [200]>
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link type="favicon" rel="shortcut icon" href="/favicon.ico" />
<title>天气预报定制-cooperfang的博客-51CTO博客</title>
<meta name="keywords" content="requests,json">
<meta name="description" content="#apiaplicationprogramminginterface#不通软件不同系统之间的功能相互调用#json是其中重要的一种数据交换形式#定制天气预报https://www.sojson.com/open/api/weather/json.shtml?city=#http://jsonviewer.stack.hu/#https://www.sojson.com/open/api/weath"> <link href="https://static1.51cto.com/edu/blog/css/header.css?v=1.0.5.1" rel="stylesheet"><link href="https://static1.51cto.com/edu/blog/css/other.css?v=1.0.3.2" rel="stylesheet"><link href="https://static1.51cto.com/edu/blog/css/right.css?v=1.0.4.7" rel="stylesheet"><link href="https://static1.51cto.com/edu/blog/css/blog_details.css?v=1.0.7.1" rel="stylesheet"><link href="https://static1.51cto.com/edu/blog/css/highlight.css" rel="stylesheet">
<script src="https://static1.51cto.com/edu/center/js/jquery.min.js"></script><script src="https://static1.51cto.com/edu/blog/js/cookie.js"></script><script src="https://static1.51cto.com/edu/blog/js/login.js?v=1.0.0.6"></script><script src="https://static1.51cto.com/edu/blog/js/common.js?v=1.0.0.8"></script><script src="https://static1.51cto.com/edu/blog/js/mbox.js"></script><script src="https://static1.51cto.com/edu/blog/js/follow.js?v=1.0.0.8"></script><script src="https://static1.51cto.com/edu/blog/js/vip.js?v=1.0.0.1"></script></head>
<body>
<img src="https://cache.yisu.com/upload/information/20200310/57/121489.jpg" border="0" >
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
<![endif]-->
<div class="Header">
<div class="Page ">
<h2 class="Logo"><a href="https://blog.51cto.com/">Logo</a></h2>
<ul class="Navigates fl">
<li ><a href="https://blog.51cto.com/">首页</a></li>
<li ><a href="https://blog.51cto.com/original">文章</a></li>
<li ><a href="https://blog.51cto.com/blog/follow">关注</a></li>
<li class="">
<a class="column-stress" href="https://blog.51cto.com/cloumn/index">订阅专栏<b></b></a>
</li>
<li class="">
<a href="https://blog.51cto.com/expert">专家</a>
</li>
</ul>
<ul class="Navigates Navigates-right fr">
<li class="more maps">
<a href="javascript:void(0);">网站导航</a>
<div>
<a href="http://edu.51cto.com" target="_blank">学院</a>
<a href="https://blog.51cto.com" target="_blank">博客</a>
<a href="http://down.51cto.com" target="_blank">下载</a>
<a href="http://home.51cto.com" target="_blank">家园</a>
<a href="http://bbs.51cto.com" target="_blank">论坛</a>
<a href="http://x.51cto.com" target="_blank">CTO训练营</a>
<a href=" http://club.51cto.com?blog" target="_blank">CTO俱乐部</a>
<a href="http://wot.51cto.com" target="_blank">WOT</a>
<a href="http://www.51cto.com" target="_blank">51CTO</a>
<i class="arrow"></i>
</div>
</li>
<li><a href="http://home.51cto.com/user/register?reback=http%253A%252F%252Fblog.51cto.com%252F13118411%252F2154806" target="_self">注册</a></li>
<li class="login"><a href="/user/login?reback=http%3A%2F%2Fblog.51cto.com%2F13118411%2F2154806" target="_self">登录</a></li>
<li class="mRead">
<a href="javascript:;">手机阅读</a>
<div>
<img src="https://cache.yisu.com/upload/information/20200310/57/121490.jpg">
<p>扫一扫体验手机阅读</p>
<i class="arrow"></i>
</div>
</li>
<li class="search"><a href="https://blog.51cto.com/search/index" target="_self">搜索</a></li>
<li class="write"><a href="javascript:;" onClick="Login()">写文章</a></li>
</ul>
<div class="clear"></div>
</div>
</div>
<script>
var isLogin = '0';
var userId = '';
var imgpath = 'https://s1.51cto.com/';
var BLOG_URL = 'https://blog.51cto.com/';
var msg_num_url = '/index/ajax-msg-num';
$('.msg-follow, .msg-follow-max').eq(1).css({top: '91px'});
$('.msg-follow, .msg-follow-max').eq(2).css({top: '121px'});
setTimeout(function(){
$.ajax({
url:msg_num_url,
type:"get",
dataType:'json',
success:function(res){
if(res.status == '0'){
//
var hasNewMsg = false;
if(res.data.msgNum > 0 && !$('#myMsg i').hasClass('dot')){
$('#myMsg i').addClass('dot');
hasNewMsg = true;
}
if(res.data.notifyNum > 0 && !$('#myNotify i').hasClass('dot')){
$('#myNotify i').addClass('dot');
hasNewMsg = true;
}
if(res.data.recommend_new > 0 && !$('#myRecommend i').hasClass('dot')){
$('#myRecommend i').addClass('dot');
hasNewMsg = true;
}
if(hasNewMsg && !$('#myAllMsg i').hasClass('dot')){
$('#myAllMsg i').addClass('dot');
}
}
}
});
},70);
</script>
<div class="Content-box">
<link rel="stylesheet" href="https://static1.51cto.com/edu/blog/css/mdeShow.css?v=1.0.0.9">
<link rel="stylesheet" href="https://static1.51cto.com/edu/blog/css/tinyscrollbar.css"/>
<script type="text/javascript" src="https://static1.51cto.com/edu/blog/js/jquery.tinyscrollbar.js"></script>
<div class="Content Index" >
<div class="Page M764">
<!-- left start -->
<div class="artical-Left-blog">
<div class="status">
<a class="tab_name original">原创</a>
</div>
<h2 class="artical-title">天气预报定制</h2>
<div class="artical-title-list">
<div class="is-vip-bg-6 fl">
<a href="https://blog.51cto.com/13118411" class="a-img" target="_blank"><img class="is-vip-img is-vip-img-4" data-uid="13108411" src="https://cache.yisu.com/upload/information/20200310/57/121491.jpg"></a>
</div>
<a href="https://blog.51cto.com/13118411" class="name fl" target="_blank">cooperfang</a>
<a class="comment comment-num fr"><font class="comment_number">0</font>人评论</a>
<span class="fr"></span>
<a href="javascript:;" class="read fr">124人阅读</a>
<a href="javascript:;" class="time fr">2018-08-04 22:59:05</a>
<div class="clear"></div>
</div>
<div class="artical-content-bak main-content">
<div class="con artical-content editor-preview-side"><pre><code class="language-python"># api aplication programming interface
# 不通软件不同系统之间的功能相互调用
# json是其中重要的一种数据交换形式
# 定制天气预报 https://www.sojson.com/open/api/weather/json.shtml?city=
# http://jsonviewer.stack.hu/
# https://www.sojson.com/open/api/weather/json.shtml
?city=%E5%8C%97%E4%BA%AC</code></pre>
<pre><code class="language-python">import requests # pip install requests 请求 网上api的调用形式
url = 'https://www.sojson.com/open/api/weather/json.shtml?city='
city = '北京'
ret = requests.get(url + city) # 请求的对象
print(ret.json())</code></pre>
<pre><code>{'date': '20180804', 'message': 'Success !', 'status': 200, 'city': '北京', 'count': 9, 'data': {'shidu': '70%', 'pm25': 44.0, 'pm10': 78.0, 'quality': '良', 'wendu': '30', 'ganmao': '极少数敏感人群应减少户外活动', 'yesterday': {'date': '03日星期五', 'sunrise': '05:13', 'high': '高温 36.0℃', 'low': '低温 26.0℃', 'sunset': '19:27', 'aqi': 107.0, 'fx': '南风', 'fl': '<3级', 'type': '晴', 'notice': '愿你拥有比阳光明媚的心情'}, 'forecast': [{'date': '04日星期六', 'sunrise': '05:14', 'high': '高温 36.0℃', 'low': '低温 27.0℃', 'sunset': '19:26', 'aqi': 97.0, 'fx': '南风', 'fl': '<3级', 'type': '晴', 'notice': '愿你拥有比阳光明媚的心情'}, {'date': '05日星期日', 'sunrise': '05:15', 'high': '高温 35.0℃', 'low': '低温 25.0℃', 'sunset': '19:25', 'aqi': 103.0, 'fx': '东南风', 'fl': '<3级', 'type': '雷阵雨', 'notice': '带好雨具,别在树下躲雨'}, {'date': '06日星期一', 'sunrise': '05:16', 'high': '高温 31.0℃', 'low': '低温 25.0℃', 'sunset': '19:24', 'aqi': 97.0, 'fx': '南风', 'fl': '<3级', 'type': '雷阵雨', 'notice': '带好雨具,别在树下躲雨'}, {'date': '07日星期二', 'sunrise': '05:17', 'high': '高温 31.0℃', 'low': '低温 25.0℃', 'sunset': '19:22', 'aqi': 113.0, 'fx': '西南风', 'fl': '<3级', 'type': '雷阵雨', 'notice': '带好雨具,别在树下躲雨'}, {'date': '08日星期三', 'sunrise': '05:18', 'high': '高温 30.0℃', 'low': '低温 24.0℃', 'sunset': '19:21', 'aqi': 68.0, 'fx': '东南风', 'fl': '<3级', 'type': '雷阵雨', 'notice': '带好雨具,别在树下躲雨'}]}}</code></pre>
<pre><code class="language-python"># 象字典一样取值
d = ret.json()
# print(d['status'])
# print(d['city'])
# print(d['data'])
# print(d['data']['yesterday'])
def hot_weather(data):
"""定制化天气预报"""
try:
weather_list = data['data']['forecast']
# print(weather_list)
for day in weather_list:
print(day['date'], day['high'], day['low'], day['sunset'], day['notice'])
except Exception as e:
print(e)
hot_weather(d)</code></pre>
<pre><code>04日星期六 高温 36.0℃ 低温 27.0℃ 19:26 愿你拥有比阳光明媚的心情
05日星期日 高温 35.0℃ 低温 25.0℃ 19:25 带好雨具,别在树下躲雨
06日星期一 高温 31.0℃ 低温 25.0℃ 19:24 带好雨具,别在树下躲雨
07日星期二 高温 31.0℃ 低温 25.0℃ 19:22 带好雨具,别在树下躲雨
08日星期三 高温 30.0℃ 低温 24.0℃ 19:21 带好雨具,别在树下躲雨</code></pre>
<pre><code class="language-python">%cd D:\全栈\json api
d = ret.json()
import json
with open('weather.json', 'w') as f:
json.dump(d, f)</code></pre>
<pre><code>D:\全栈\json api</code></pre></div>
</div>
<div class="artical-copyright mt26">©著作权归作者所有:来自51CTO博客作者cooperfang的原创作品,如需转载,请注明出处,否则将追究法律责任</div>
<div class="for-tag mt26">
<a href="https://blog.51cto.com/search/result?q=requests" target="_blank">requests</a>
<a href="https://blog.51cto.com/search/result?q=json" target="_blank">json</a>
<div class="clear"></div>
</div>
<div class="more-list">
<p class="is-praise fl "><span type="1" blog_id="2154806" userid='13108411'>0</span></p>
<div class="share-box fr">
<p class="share"><i></i>分享</p>
<div class="bdsharebuttonbox">
<span></span>
<a class="bds_tsina" data-cmd="tsina" >微博</a>
<a class="bds_sqq" data-cmd="sqq" >QQ</a>
<a class="bds_weixin" data-cmd="weixin" >微信</a>
<img src="/qr/qr-url?url=http%3A%2F%2Fblog.51cto.com%2F13118411%2F2154806">
</div>
</div>
<p class="favorites favorites-opt fr"><i></i>收藏</p>
<div class="clear"></div>
</div>
<div class="artical-list">
<a class="fl" href="https://blog.51cto.com/13118411/2154797" title="json">
上一篇:json</a>
<div class="clear"></div>
</div>
<div class="author-module">
<div class="is-vip-bg-6 fl">
<a href="https://blog.51cto.com/13118411" class="a-img" target="_blank">
<img class="is-vip-img is-vip-img-4" data-uid="13108411" src="https://cache.yisu.com/upload/information/20200310/57/121491.jpg">
</a>
</div>
<div class="author-module-center fl">
<a class="h3" href="https://blog.51cto.com/13118411" target="_blank">cooperfang</a>
<h4>42篇文章,1W+人气,0粉丝</h4>
</div>
<div class="clear"></div>
</div>
</div>
<div class="artical-Left" id="comment">
<!-- 发布评论 -->
<div class="comment-creat">
<div class="is-vip-bg-6 fl">
<a href="https://blog.51cto.com/13118411" class="header-img" target="_blank">
<img src="https://cache.yisu.com/upload/information/20200310/57/121491.jpg"/>
</a>
</div>
<div class="first-publish fr publish_user_id">
<textarea class="textareadiv textareadiv-publish" name="" id="" placeholder="提问和评论都可以,用心的回复会被更多人看到和认可" maxlength="500"></textarea>
<div class="comment-push">
<p class="msg fl">Ctrl+Enter 发布</p>
<p class="publish-btn blue-btn fr" flag="1">发布</p>
<p class="cancel-btn cancel-btn-1 fr">取消</p>
<div class="clear"></div>
</div>
<input type="hidden" class="user_id" value="13108411">
<input type="hidden" class="reply_id" value="2154806">
<input type="hidden" class="first_pid" value="">
</div>
<div class="clear"></div>
</div>
<div class="commentList">
</div>
<!-- page -->
<div class="act_pageList_box"></div>
</div>
<!-- end left -->
<!-- right start -->
<div class="Blog-Right artical-Right">
<a class="catalog"></a>
<a class="scrollTop" href="javascript:void(0);" onclick="$(window).scrollTop(0);"></a>
</div>
<!-- end right -->
</div>
<div class="special-column">
<div class="Page M764">
<div class="column-1">
<h3 class="column-tit">推荐专栏</h3>
<div class="column-box">
<a href="https://blog.51cto.com/cloumn/detail/13" class="a-img fl cloumn-tab-par" target="_blank">
<img src="https://cache.yisu.com/upload/information/20200310/57/121492.jpg">
<span class="cloumn-tab-new cloumn-tab-new-1 cloumn-tab2 f12">上新</span>
</a>
<div class="center fl">
<a class="h3 white-space" href="https://blog.51cto.com/cloumn/detail/13" target="_blank">基于Python的DevOps实战</a>
<h4 class="white-space">运维开发全攻略</h4>
<h5 class="white-space">共18章 | <a href="https://blog.51cto.com/yuhongchun" target="_blank">抚琴煮酒</a></h5>
<h6><span class="price">¥51.00</span><span>6人订阅</span></h6>
</div>
<div class="right fr">
<a class="cloumn-subscribe" cid="13" href="/cloumn/detail/13" ask='1' target="_blank">订阅</a>
</div>
<div class="clear"></div>
</div>
<div class="column-box">
<a href="https://blog.51cto.com/cloumn/detail/4" class="a-img fl cloumn-tab-par" target="_blank">
<img src="https://cache.yisu.com/upload/information/20200310/57/121493.jpg">
</a>
<div class="center fl">
<a class="h3 white-space" href="https://blog.51cto.com/cloumn/detail/4" target="_blank">微服务技术架构和大数据治理实战</a>
<h4 class="white-space">大数据时代的微服务之路</h4>
<h5 class="white-space">共18章 | <a href="https://blog.51cto.com/ityouknow" target="_blank">纯洁微笑</a></h5>
<h6><span class="price">¥51.00</span><span>496人订阅</span></h6>
</div>
<div class="right fr">
<a class="cloumn-subscribe" cid="4" href="/cloumn/detail/4" ask='1' target="_blank">订阅</a>
</div>
<div class="clear"></div>
</div>
</div>
<div class="column-2" >
<h3 class="column-tit">猜你喜欢</h3>
<div class="column-box">
<a class="white-space" href="https://blog.51cto.com/13118411/2154797?source=dra" target="_blank">json</a>
<a class="white-space" href="https://blog.51cto.com/13118411/2154710?source=dra" target="_blank">v0.35</a>
<a class="white-space" href="https://blog.51cto.com/laputaliya/536858?source=drt" target="_blank">JQuery ajax返回JSON时的处理方式</a>
<a class="white-space" href="https://blog.51cto.com/zhaojianping/629526?source=drt" target="_blank">android 读取json数据(遍历JSONObject和JSONArray)</a>
<a class="white-space" href="https://blog.51cto.com/huqilong/136802?source=drt" target="_blank">struts2 json jquery 集成详解</a>
<a class="white-space" href="https://blog.51cto.com/12731497/2154195?source=drh" target="_blank">谈谈Python实战数据可视化之pyplot模块</a>
<a class="white-space" href="https://blog.51cto.com/13719825/2151358?source=drh" target="_blank">用爬虫和Flask打造属于自己的电影网站,完整教程送上!</a>
<a class="white-space" href="https://blog.51cto.com/lavenliu/2150518?source=drh" target="_blank">掌握面向对象编程本质,彻底掌握OOP</a>
<div class="clear"></div>
</div>
</div>
</div>
</div>
<div class="the-lowest-bg">
<div class="the-lowest Page M764">
<p class="is-praise fl "><span type="1" blog_id="2154806" userid='13108411'></span></p>
<p class="b-favorites favorites-opt fl"><i></i><b>0</b></p>
<a class="b-reply fl"><i></i><font class="comment_number"></font></a>
<div class="b-share fl">
<i></i>分享
<div class="bdsharebuttonbox">
<a class="bds_tsina p2" data-cmd="tsina"></a>
<a class="bds_sqq p3" data-cmd="sqq"></a>
<a class="bds_weixin p1" data-cmd="weixin"><em class="code-icon"></em><img class="code-img" src="/qr/qr-url?url=http%3A%2F%2Fblog.51cto.com%2F13118411%2F2154806"></a>
</div>
</div>
<a href="https://blog.51cto.com/13118411" class="b-name fr">cooperfang</a>
<div class="is-vip-bg-6 fr">
<a href="https://blog.51cto.com/13118411" class="b-img"><img class="is-vip-img is-vip-img-4" data-uid="13108411" src="https://cache.yisu.com/upload/information/20200310/57/121491.jpg"></a>
</div>
<div class="clear"></div>
</div>
</div>
</div>
<!-- 老博文美观处理 -->
<script>
var praise_url = 'https://blog.51cto.com/praise/praise'
addReply_url = 'https://blog.51cto.com/comments/add'
removeUrl = 'https://blog.51cto.com/comments/del'
blog_id = '2154806'
rid = '0'
is_comment = '0'
comment_list = '/blog/ajax-comment-list'
comment_sort = "asc"
index_url = 'https://blog.51cto.com/13118411';
uc_url = 'http://ucenter.51cto.com/'
blog_url = 'https://blog.51cto.com/'
img_url = 'https://static1.51cto.com/edu/blog/'
i_user_id = ''
c_user_id ='13108411'
collect_url = 'https://blog.51cto.com/collect/add'
is_old = '0'
nicknameurl = 'https://blog.51cto.com/13118411'
nickname = 'cooperfang'
myself = window.location.href
$('.you-like-list li:odd').css({'margin-left': '10%'});
$('.column-box a:odd').addClass('left-list')
$('.myUrl').text(myself).click(function(){window.open(myself)})
setTimeout(function(){$('.Footer').css({'margin-top':'-50px'})},50)
if(is_old==1){SyntaxHighlighter.all();}
window._bd_share_config={
"common":{
"bdText":"天气预报定制",
"bdDesc":$("#abstract_bdshare").text(),
"bdMini":"2",
"bdMiniList":false,
"bdPic":"https://cache.yisu.com/upload/information/20200310/57/121494.jpg",
"bdStyle":"0",
"bdSize":"16"
},
"share":{}
};
with(document)0[(getElementsByTagName('head')[0]||body).appendChild(createElement('script')).src='http://bdimg.share.baidu.com/static/api/js/share.js?v=89860593.js?cdnversion='+~(-new Date()/36e5)];
setTimeout(function(){
$('.bdsharebuttonbox a').removeAttr('title')
},1000)
</script>
</div>
<script src="https://static1.51cto.com/edu/blog/js/marked.min.js?v=1.0.0.5"></script><script src="https://static1.51cto.com/edu/blog/js/highlight.js"></script><script src="https://static1.51cto.com/edu/blog/js/detail_mp.js?v=2.0.1.7"></script><script src="https://static1.51cto.com/edu/blog/js/detail.js?v=1.0.6.9"></script><script src="https://static1.51cto.com/edu/blog/js/details_new.js?v=1.1.1"></script><script src="https://static1.51cto.com/edu/blog/js/copy.js?v=1.0.0.0"></script> <script src="https://static1.51cto.com/edu/blog/js/pvlog.js"></script>
<script src="https://logs.51cto.com/rizhi/count/count.js"></script>
<script>
$(".gotop").click(function(){$(window).scrollTop(0)})
</script>
<script type="text/javascript">
//百度统计代码
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "https://hm.baidu.com/hm.js?2283d46608159c3b39fc9f1178809c21";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
//自动推送链接
(function(){
var bp = document.createElement('script');
var curProtocol = window.location.protocol.split(':')[0];
if (curProtocol === 'https') {
bp.src = 'https://zz.bdstatic.com/linksubmit/push.js';
}
else {
bp.src = 'http://push.zhanzhang.baidu.com/push.js';
}
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(bp, s);
})();
var _vds = _vds || [];
window._vds = _vds;
(function(){
_vds.push(['setAccountId', '8c51975c40edfb67']);
(function() {
var vds = document.createElement('script');
vds.type='text/javascript';
vds.async = true;
vds.src = ('https:' == document.location.protocol ? 'https://' : 'http://') + 'assets.growingio.com/vds.js';
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(vds, s);
})();
})();
document.write(decodeURI("%3Cscript src='https://cache.yisu.com/upload/information/20200310/57/121495.jpg' type='text/javascript'%3E%3C/script%3E"));
</script>
<script>
var uid = '';
var BLOG_URL = 'https://blog.51cto.com/';
</script>
<script src="https://static1.51cto.com/edu//blog/js/jquery.cookie.js"></script>
<script src="https://static1.51cto.com/edu/blog/js/time-on-page.js?v=1.0.2" charset="utf-8"></script>
<script>
(function(){
var wh=$(window).height(),fh=$('.Footer').outerHeight(true),hh=$('.Header').outerHeight(true)
$('.Content-box').css({'min-height': wh-fh-hh})
})()
</script>
</body>
</html>
contents = BeautifulSoup(data.text, 'html.parser')  # data.text is the page HTML; 'html.parser' is Python's built-in parser
# print(contents) prints the parsed document in a normalized form
all_p = contents.find_all('p')  # find every <p> tag
all_text = ''
for p in all_p:
    # print(p.text)
    all_text += str(p.text)  # concatenate everything into one string
print(all_text)
扫一扫体验手机阅读0分享收藏Ctrl+Enter 发布发布取消0
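The output above contains only widget text because, in the page source printed earlier, the `<p>` tags hold the share, favorite, and comment controls, while the article body sits inside `<pre><code>` blocks. The following is a minimal sketch of that effect using only the standard library's `html.parser` rather than Beautiful Soup; the names `TagTextCollector`, `collect_text`, and `sample_html` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text found inside every element with a given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # > 0 while we are inside a matching element
        self.chunks = []    # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def collect_text(html, tag):
    """Return the concatenated text of all <tag> elements in html."""
    parser = TagTextCollector(tag)
    parser.feed(html)
    parser.close()
    return ''.join(parser.chunks)


# Hypothetical snippet shaped like the page above: widget text in <p>,
# article body in <pre><code>.
sample_html = (
    '<div><p>扫一扫体验手机阅读</p>'
    '<pre><code>city = "北京"</code></pre>'
    '<p class="share"><i></i>分享</p></div>'
)

p_text = collect_text(sample_html, 'p')      # widget text only
pre_text = collect_text(sample_html, 'pre')  # the article's code block
print(p_text)
print(pre_text)
```

With Beautiful Soup the same idea would presumably amount to searching for a different tag than `'p'`; which tag or class to target depends on the page being scraped.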
# pip install jieba  (splits Chinese text into individual words)
import jieba
text = jieba.cut(all_text)  # jieba.cut() returns a generator of words
"""
Signature: jieba.cut(sentence, cut_all=False, HMM=True)
Docstring:
The main function that segments an entire sentence that contains
Chinese characters into seperated words.
"""
text_list = []
for t in text:
    print(t)
    text_list.append(t)
Building prefix dict from the default dictionary ...
Dumping model to file cache C:\Users\coop\AppData\Local\Temp\jieba.cache
Loading model cost 1.107 seconds.
Prefix dict has been built succesfully.
扫一扫
体验
手机
阅读
0
分享
收藏
Ctrl
+
Enter
发布
发布
取消
0
import collections  # part of Python's standard library; like jieba above, it can be thought of as an API
count = collections.Counter(text_list)  # build a Counter object
for key, val in count.most_common(30):
    # ordered: returns the n most frequent items
    print(key, val)
0 2
发布 2
扫一扫 1
体验 1
手机 1
阅读 1
分享 1
收藏 1
Ctrl 1
+ 1
Enter 1
1
取消 1
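The counts above include tokens that are not really words, such as `+` and a blank entry (a non-breaking space, `'\xa0'`). One possible refinement, sketched here under the assumption that only tokens containing at least one letter, digit, or Chinese character are of interest, is to filter the list before counting. The helper name `count_words` is hypothetical, and the position of `'\xa0'` in the token list is an assumption.

```python
import collections


def count_words(tokens, max_num=30):
    """Count tokens, skipping whitespace-only and punctuation-only entries."""
    kept = [
        t.strip() for t in tokens
        if any(ch.isalnum() for ch in t)  # needs a letter, digit, or CJK character
    ]
    return collections.Counter(kept).most_common(max_num)


# Tokens copied from the jieba output above; '\xa0' is the non-breaking space.
tokens = ['扫一扫', '体验', '手机', '阅读', '0', '分享', '收藏',
          'Ctrl', '+', 'Enter', '\xa0', '发布', '发布', '取消', '0']
top = count_words(tokens)
print(top)
```

Whether digits such as `0` should also be dropped depends on the use case; this sketch keeps them.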
# Build an interface: you can hand others this .py file, or expose it as a link
import collections
def get_most_common(text_list, max_num=30):
    """Return the top max_num words and their counts."""
    ret = {'status': 0, 'statusText': 'ok', 'data': {}}  # a common API response format
    try:
        new_list = list(text_list)
        count = collections.Counter(new_list)
        ret['data'] = count.most_common(max_num)
    except Exception as e:
        ret['status'] = 1
        ret['statusText'] = str(e)  # store the error message as a string
    return ret
get_most_common(text_list)
{'status': 0,
'statusText': 'ok',
'data': [('0', 2),
('发布', 2),
('扫一扫', 1),
('体验', 1),
('手机', 1),
('阅读', 1),
('分享', 1),
('收藏', 1),
('Ctrl', 1),
('+', 1),
('Enter', 1),
('\xa0', 1),
('取消', 1)]}