Nginx服务整理日志分析（shell+python）的两种方法

python脚本

3.线上nginx的一次“no live upstreams while connecting to upstream ”分析

4.nginx proxy_pass末端神奇的斜线

5.nginx+tomcat使用apache的FtpClient上传图片时由于多线程问题导致的文件大小为0的问题

案例一
ip - - [23/Mar/2017:00:17:49 +0800] "GET / HTTP/1.1" 302 0 "-" "PycURL/7.19.7"
 
log_format access '$HTTP_X_REAL_IP - $remote_user [$time_local] "$request"'
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" $HTTP_X_Forwarded_For';
 
192.168.21.1 - - [27/Jan/2014:11:28:53 +0800] "GET /2.php HTTP/1.1" 200 133 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1707.0 Safari/537.36" "-"192.168.21.128 200 127.0.0.1:9000 0.119 0.119
 
#log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '    
#                  '$status $body_bytes_sent "$http_referer" '
#                  '"$http_user_agent" "$http_x_forwarded_for"';
 
$http_host：用户在浏览器中输入的URL（IP或着域名）地址  192.168.21.128
$upstream_status： upstream状态    200
$upstream_addr： 后端upstream地址及端口  127.0.0.1:9000
$request_time： 页面访问总时间  0.119
$upstream_response_time：页面访问中upstream响应时间   0.119
 
$10 $body_bytes_sent
$1  $remote_addr
$7  $request
$11 $http_referer
$9  $status
$6  http_user_agent
 
1、总访问量
2、总带宽
3、独立访客量
4、访问IP统计
5、访问url统计
6、来源统计
7、404统计
8、搜索引擎访问统计(谷歌，百度)
9、搜索引擎来源统计(谷歌，百度)
 
#!/bin/bash
log_path=/home/HdhCmsTestcentos.bz/log/access.log.1
domain="centos.bz"
email="log@centos.bz"
maketime=`date +%Y-%m-%d" "%H":"%M`
logdate=`date -d "yesterday" +%Y-%m-%d`
total_visit=`wc -l ${log_path} | awk '{print $1}'`
total_bandwidth=`awk -v total=0 '{total+=$10}END{print total/1024/1024}' ${log_path}`
total_unique=`awk '{ip[$1]++}END{print asort(ip)}' ${log_path}`
ip_pv=`awk '{ip[$1]++}END{for (k in ip){print ip[k],k}}' ${log_path} | sort -rn | head -20`
url_num=`awk '{url[$7]++}END{for (k in url){print url[k],k}}' ${log_path} | sort -rn | head -20`
referer=`awk -v domain=$domain '$11 !~ 
/http:\/\/[^/]*'"$domain"'/{url[$11]++}END{for (k in url){print 
url[k],k}}' ${log_path} | sort -rn | head -20`
notfound=`awk '$9 == 404 {url[$7]++}END{for (k in url){print url[k],k}}' ${log_path} | sort -rn | head -20`
spider=`awk -F'"' '$6 ~ /Baiduspider/ {spider["baiduspider"]++} $6 ~
 /Googlebot/ {spider["googlebot"]++}END{for (k in spider){print 
k,spider[k]}}'  ${log_path}`
search=`awk -F'"' '$4 ~ /http:\/\/www\.baidu\测试数据/ 
{search["baidu_search"]++} $4 ~ /http:\/\/www\.google\测试数据/ 
{search["google_search"]++}END{for (k in search){print k,search[k]}}' 
${log_path}`
#echo -e "概况\n报告生成时间：${maketime}\n总访问量:${total_visit}\n总带宽:${total_bandwidth}M\n独
立访客:${total_unique}\n\n访问IP统计\n${ip_pv}\n\n访问url统计\n${url_num}\n\n来源页面统计
\n${referer}\n\n404统计\n${notfound}\n\n蜘蛛统计\n${spider}\n\n搜索引擎来源统计
\n${search}" | mail -s "$domain $logdate log statistics" ${email}

案例二
# tar zxvf pymongo-1.11.tar.gz
# cd pymongo-1.11
# python setup.py install
python连接mongodb样例
$ cat conn_mongodb.py 
#!/usr/bin/python
   
import pymongo
import random
   
conn = pymongo.Connection("127.0.0.1",27017)
db = conn.tage #连接库
db.authenticate("tage","123")
#用户认证
db.user.drop()
#删除集合user
db.user.save({'id':1,'name':'kaka','sex':'male'})
 #插入一个数据
for id in range(2,10):
    name = random.choice(['steve','koby','owen','tody','rony'])
    sex = random.choice(['male','female'])
    db.user.insert({'id':id,'name':name,'sex':sex}) 
#通过循环插入一组数据
content = db.user.find()
#打印所有数据
for i in content:
    print i
 
编写python脚本
#encoding=utf8
   
import re
   
zuidaima_nginx_log_path="/usr/local/nginx/logs/HdhCmsTestzuidaima测试数据.access.log"
pattern = re测试数据pile(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
   
def stat_ip_views(log_path):
    ret={}
    f = open(log_path, "r")
    for line in f:
        match = pattern.match(line)
        if match:
            ip=match.group(0)
            if ip in ret:
                views=ret[ip]
            else:
                views=0
            views=views+1
            ret[ip]=views
    return ret
def run():
    ip_views=stat_ip_views(zuidaima_nginx_log_path)
    max_ip_view={}
    for ip in ip_views:
        views=ip_views[ip]
        if len(max_ip_view)==0:
            max_ip_view[ip]=views
        else:
            _ip=max_ip_view.keys()[0]
            _views=max_ip_view[_ip]
            if views>_views:
                max_ip_view[ip]=views
                max_ip_view.pop(_ip)
   
        print "ip:", ip, ",views:", views
    #总共有多少ip
    print "total:", len(ip_views)
    #最大访问的ip
    print "max_ip_view:", max_ip_view
   
run()

以上就是Nginx服务整理日志分析（shell+python）的两种方法的详细内容，更多请关注Gxl网其它相关文章！

声明：本文来自网络，不代表【好得很程序员自学网】立场，转载请注明出处：http://www.haodehen.cn/did85389

更新时间：2022-10-19 阅读：54次