Tuesday, November 23, 2010

Extract comments in your python code

Comments in your code always help a new person understand the logic and flow of a program. I guess people enjoy programming more if they comment properly, the same way I do. With proper comments, a reader does not have to go through the code line by line. The best way to comment is in such a way that the comments can be extracted easily. I mark the comments in my Python code with '##', which makes it easy to extract them from a script. The following code extracts my comments:


import sys

## python extract_comment.py filename comment_delimeter
filename = sys.argv[1]
sepr = sys.argv[2]
fl = open(filename, 'r')
for row in fl:
    ind = row.find(sepr)
    ## print only the rows that contain the delimiter, from the delimiter onwards
    if ind != -1:
        print(row[ind:].rstrip('\n'))

fl.close()



Copy the code into a file named extract_comment.py. Here filename is the Python script from which you want to extract comments, and comment_delimeter is '##' in my case.
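
The same idea can also be wrapped in a function so that it is reusable from other scripts. This is only a minimal Python 3 sketch: the name extract_comments and the sample lines are made up for illustration, and it assumes the delimiter never appears inside a string literal.

```python
def extract_comments(lines, delimiter="##"):
    """Return the text from the first delimiter to the end of each matching line."""
    comments = []
    for line in lines:
        index = line.find(delimiter)
        if index != -1:  # find() returns -1 when the delimiter is absent
            comments.append(line[index:].rstrip("\n"))
    return comments


# made-up sample input, standing in for the lines of a real script
sample = [
    "import sys\n",
    "## read the file name from the command line\n",
    "filename = sys.argv[1]  ## first argument\n",
    "print(filename)\n",
]
for comment in extract_comments(sample):
    print(comment)
```

Passing an open file object instead of the sample list should work the same way, since a file is iterated line by line.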

Tuesday, November 2, 2010

Simple Web Server in python

Recently, I was hanging around some Flex code that calls a Python script residing on another server through web services. I got confused: is it a good idea to use a web service just to call a Python script on another server? Why not use the cgi module or mod_python to get the same result?

So I decided to write a simple web server that exposes some methods as URLs. I got excellent help from
http://fragments.turtlemeat.com/pythonwebserver.php
and then added some code of my own.

        import random
        from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

        class VivekServer(BaseHTTPRequestHandler):

            def do_GET(self):
                try:
                    if self.path == '/fetch':
                        self.send_response(200)
                        self.send_header('Content-type', 'text/html')
                        self.end_headers()
                        # wcount() returns (matching tags, link count, url)
                        idea, num, htm = self.wcount()
                        self.wfile.write("Number of count for 'anyword' :")
                        self.wfile.write(str(num))
                        self.wfile.write(" url is :")
                        self.wfile.write(htm)
                        return
                    if self.path == '/calculate':
                        self.send_response(200)
                        self.send_header('Content-type', 'text/html')
                        self.end_headers()
                        res = self.calculate()
                        for each in res:
                            # each is a (number, letter) tuple, so convert it first
                            self.wfile.write(str(each))
                            self.wfile.write('\n')
                        return
                    # any other path is unknown
                    self.send_error(404, 'File Not Found: %s' % self.path)
                except IOError:
                    self.send_error(404, 'File Not Found: %s' % self.path)

            def calculate(self):
                # build 99 (random number, random letter) pairs
                WORD = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                data = []
                for i in range(1, 100):
                    data.append((random.randrange(0, 1000), random.choice(WORD)))
                return data

            def do_POST(self):
                pass

            def wcount(self):
                # imported here so that /calculate works without BeautifulSoup installed
                from BeautifulSoup import BeautifulSoup as soup
                import urllib2

                htm = 'http://www.anyurl.com'
                html_text = urllib2.urlopen(htm).read()
                sp = soup(html_text)
                idea = sp.findAll("anyword")
                all_a = [each.get('href') for each in sp.findAll('a')]
                num = 0
                for each in all_a:
                    # skip <a> tags without href; count a match at any position
                    if each and "anyword" in each:
                        num = num + 1
                return (idea, num, htm)

        def main():
            server = HTTPServer(('', 7999), VivekServer)
            try:
                print('Server Started.....')
                server.serve_forever()
            except KeyboardInterrupt:
                print('Server Ends.....')
                server.socket.close()

        if __name__ == '__main__':
            main()

To run the above code, save it as abovecode.py and run: python abovecode.py

Then open a web browser and visit these URLs:
http://localhost:7999/fetch
http://localhost:7999/calculate
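
For readers on Python 3, where BaseHTTPServer has become http.server, the /calculate part of the server could be sketched roughly as below. This is not the original code: the names build_body, SketchHandler and make_server are made up for this illustration, and the /fetch route is left out because it depends on the third-party BeautifulSoup package.

```python
import random
import string
from http.server import BaseHTTPRequestHandler, HTTPServer


def calculate(count=99):
    """Build `count` (random number, random uppercase letter) pairs."""
    return [(random.randrange(0, 1000), random.choice(string.ascii_uppercase))
            for _ in range(count)]


def build_body(path):
    """Map a request path to a response body, or None for unknown paths."""
    if path == '/calculate':
        return '\n'.join(str(pair) for pair in calculate())
    return None


class SketchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = build_body(self.path)
        if body is None:
            self.send_error(404, 'File Not Found: %s' % self.path)
            return
        payload = body.encode('utf-8')  # wfile expects bytes in Python 3
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; charset=utf-8')
        self.send_header('Content-Length', str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=7999):
    """Create the server; call serve_forever() on the result to start it."""
    return HTTPServer(('', port), SketchHandler)
```

Calling make_server().serve_forever() starts it on port 7999. Keeping the path-to-body mapping in a plain function means it can be checked without opening a socket.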

Friday, September 10, 2010

Split a list into a number of pieces

Today, while working with CSV files, I got into an interesting situation: I had a list of a million values and had to iterate over it. So I thought it would be easier if I split the list into a number of pieces, without affecting my code, memory, or CPU. I could also use a generator expression to make my code run faster, but I was curious to write code that splits a list into a number of pieces (it uses a small generator function).
And the result is here:


    def split_seq(seq, num_pieces):
        """ split a list into num_pieces consecutive pieces of near-equal length """
        start = 0
        for i in range(num_pieces):
            # seq[i::num_pieces] has exactly the length the i-th piece should have
            stop = start + len(seq[i::num_pieces])
            yield seq[start:stop]
            start = stop

    seq = list(range(100))   ## define your list here
    num_of_pieces = 3
    for piece in split_seq(seq, num_of_pieces):
        print('%d -> %s' % (len(piece), piece))
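
The piece lengths can also be computed with divmod, which avoids building a throwaway slice just to measure it. The following is only a sketch of that alternative in Python 3, and split_sizes is a helper name introduced here for illustration.

```python
def split_sizes(total, num_pieces):
    """Return piece lengths that differ by at most one and sum to `total`."""
    base, extra = divmod(total, num_pieces)
    # the first `extra` pieces take one additional item each
    return [base + 1 if i < extra else base for i in range(num_pieces)]


def split_seq(seq, num_pieces):
    """Yield consecutive slices of `seq`, one per piece."""
    start = 0
    for size in split_sizes(len(seq), num_pieces):
        yield seq[start:start + size]
        start += size


pieces = list(split_seq(list(range(10)), 3))
print(pieces)
```

For a list of 100 values and 3 pieces, this gives lengths 34, 33 and 33, the same as the slicing approach.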

Friday, September 3, 2010

Dynamically open file

Recently I had to split a 40GB CSV file into 60 pieces for further processing, with the name and content of each piece decided dynamically based on an id present in that CSV file. That made me write a little code to open, write, and close files dynamically. I wrote the following code to achieve this.

a = range(10)
for each in a:
    # build and run a statement such as: fl_0 = open('0','a')
    s = "fl_%s = open('%s','a')" % (each, each)
    exec(s)
    exec("fl_%s.write('%s')" % (each, each))
    com = "fl_%s.close()" % (each)
    exec(com)
    # can also check if file is closed or open by
    # com = "fl_%s.closed" % (each)
    # eval(com) #return true if file is closed else false
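
A different technique from the exec approach is to keep the open files in a dictionary keyed by the id. The sketch below is written for Python 3 and only illustrates that alternative: the function name split_csv_by_id, the fl_<id>.csv naming and the demo rows are assumptions, and it assumes the id values are safe to use in file names.

```python
import csv
import os
import tempfile
from contextlib import ExitStack


def split_csv_by_id(src_path, out_dir, id_column=0):
    """Append each row of src_path to out_dir/fl_<id>.csv, keyed on one column."""
    writers = {}  # id value -> csv writer for that id's file
    with ExitStack() as stack:
        src = stack.enter_context(open(src_path, newline=''))
        for row in csv.reader(src):
            key = row[id_column]
            if key not in writers:
                # open a file the first time an id is seen; ExitStack closes it later
                path = os.path.join(out_dir, 'fl_%s.csv' % key)
                handle = stack.enter_context(open(path, 'a', newline=''))
                writers[key] = csv.writer(handle)
            writers[key].writerow(row)
    return sorted(writers)


# small demonstration with a temporary directory and made-up rows
demo_dir = tempfile.mkdtemp()
demo_src = os.path.join(demo_dir, 'input.csv')
with open(demo_src, 'w', newline='') as f:
    csv.writer(f).writerows([['7', 'a'], ['9', 'b'], ['7', 'c']])
demo_ids = split_csv_by_id(demo_src, demo_dir)
print(demo_ids)
```

The input is read one row at a time, so the whole file never has to fit in memory, and every output file stays open until the input is exhausted. With around 60 distinct ids that should stay well within the usual limit on open files.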
 

Monday, August 30, 2010

Join integer value

While working with Python's csv module, I found a very interesting thing about join.
I was reading a huge CSV file line by line, and for some operations I converted each row to a list and then that list back to a string. But my rows contained some integer values, so I always got
TypeError: sequence item 5: expected string, int found

So I am writing this small example to let newcomers know about it.
I have a list and I want to join it.
ls=['a','b',4,'c']
','.join(ls)
which ends up with: TypeError: sequence item 2: expected string, int found

Instead, convert every item to a string first:
','.join(map(str,ls))
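
As a small Python 3 sketch of the same point, the conversion can also be written as a generator expression. And if the goal is to write a CSV row, csv.writer converts non-string values on its own. The buffer variable below is just an in-memory stand-in for a real file.

```python
import csv
import io

ls = ['a', 'b', 4, 'c']

# convert each item to str before joining
joined = ','.join(str(item) for item in ls)
print(joined)

# csv.writer converts non-string values itself and also handles quoting
buffer = io.StringIO()
csv.writer(buffer).writerow(ls)
print(buffer.getvalue())
```

Unlike a plain join, the writer also quotes values that themselves contain a comma, which a hand-built string would not do.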

Wednesday, August 25, 2010

Dictionary as Generator

What do you do if you are building a dictionary dynamically and it ends up with millions of keys?
Accessing that dictionary later in your code might take up a lot of resources, mightn't it?
I was also stuck in this kind of situation, and my dictionary had 10K million keys. So I used a generator over the dictionary to make my work easier.
The following code just explains how to use a dictionary with a generator.

[code]
a=range(100000)
b=range(100000)
c=dict(zip(a,b)) #create dictionary with 100000 keys
d_len=len(c)
# generator expression; iterating c directly avoids building the full c.keys() list
d_keys = (k for k in c)
for i in range(d_len):
    key = next(d_keys)
    ## do your operation on keys
[/code]
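
If the keys are easier to handle in groups, the same lazy iteration can be combined with itertools.islice. This is only a Python 3 sketch: key_batches is a made-up helper name and the tiny dictionary is for demonstration. It assumes the dictionary is not resized while it is being iterated.

```python
from itertools import islice


def key_batches(mapping, batch_size):
    """Yield lists of at most batch_size keys without copying all keys at once."""
    keys = iter(mapping)  # iterating a dict yields its keys lazily
    while True:
        batch = list(islice(keys, batch_size))
        if not batch:
            return
        yield batch


c = dict(zip(range(10), range(10)))
total = 0
for batch in key_batches(c, 4):
    # do your operation on each group of keys; here the values are summed
    total += sum(c[key] for key in batch)
print(total)
```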

Monday, August 23, 2010

Rename multiple files simultaneously

Renaming multiple files at once from the command line is a little confusing. There are lots of ways to do it programmatically, but I still didn't find any on-the-spot command for it.
So I used Python to do it simply.
My requirements were:
1) I have one dedicated folder in which I have to rename all the files.
2) All the filenames to be renamed are structured, I mean, I have to rename every dedupe_<number>.csv to <number>.csv

I did this using the following code:

import os
from os import listdir, getcwd, rename


list_files = listdir(getcwd())
for filename in list_files:
    if not filename.startswith('.') and 'dedupe_' in filename:
        # split off only the last extension, keeping any other dots in the name
        base, ext = os.path.splitext(filename)
        new_name = base.replace('dedupe_','') + ext
        # rename() needs no shell, so names with spaces are handled too
        rename(filename, new_name)

Isn't it very, very simple!
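
A slightly more defensive variant, sketched here for Python 3 with pathlib, first works out which names match dedupe_<number>.csv and only then renames them. The names plan_renames and rename_in are made up for this illustration, and the choice to skip a file whose target name already exists is an assumption, not part of the original requirement.

```python
import re
import tempfile
from pathlib import Path

PATTERN = re.compile(r'^dedupe_(\d+)\.csv$')  # matches dedupe_<number>.csv only


def plan_renames(names):
    """Return (old, new) pairs for names of the form dedupe_<number>.csv."""
    plan = []
    for name in names:
        match = PATTERN.match(name)
        if match:
            plan.append((name, match.group(1) + '.csv'))
    return plan


def rename_in(folder):
    """Apply the plan to the files in folder, skipping targets that already exist."""
    folder = Path(folder)
    for old, new in plan_renames(sorted(p.name for p in folder.iterdir())):
        if not (folder / new).exists():
            (folder / old).rename(folder / new)


# demonstration in a temporary folder with made-up file names
demo = Path(tempfile.mkdtemp())
for name in ['dedupe_1.csv', 'dedupe_22.csv', 'notes.txt']:
    (demo / name).touch()
rename_in(demo)
print(sorted(p.name for p in demo.iterdir()))
```

Printing the result of plan_renames before calling rename_in gives a dry run of what would change.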