Friday, September 10, 2010

Split a list into a number of pieces

Today, while working with CSV files, I ran into a situation where I had a list of a million values to iterate over. I thought it would be easier to split the list into a number of pieces, so that the processing would not strain my code, memory, or CPU. I could also have used a generator expression to make my code run faster, but I was curious to write code that splits a list into a number of pieces (it uses a generator function).
And the result is here:


    def split_seq(seq, num_pieces):
        """ split a list into num_pieces consecutive pieces of nearly equal size """
        start = 0
        for i in xrange(num_pieces):
            # len(seq[i::num_pieces]) is the size of the i-th piece
            stop = start + len(seq[i::num_pieces])
            yield seq[start:stop]
            start = stop

    seq = range(100)   ## define your list here
    num_of_pieces = 3
    # use a separate loop variable so the original list is not rebound
    for piece in split_seq(seq, num_of_pieces):
        print len(piece), '-> ', piece
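
The same idea can be sketched for Python 3. This version is an assumption-laden variant, not the code from the post: it computes each piece size with divmod instead of taking an extra slice just to measure its length, but it yields the same piece sizes.

```python
def split_seq(seq, num_pieces):
    """Yield num_pieces consecutive slices of seq whose lengths differ by at most one."""
    size, extra = divmod(len(seq), num_pieces)
    start = 0
    for i in range(num_pieces):
        # the first `extra` pieces get one additional element
        stop = start + size + (1 if i < extra else 0)
        yield seq[start:stop]
        start = stop


pieces = list(split_seq(list(range(100)), 3))
for piece in pieces:
    print(len(piece), '->', piece)
```

Because split_seq is a generator, each piece is only built when the loop asks for it.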

Friday, September 3, 2010

Dynamically open files

Recently I had to split a 40GB CSV file into 60 pieces for further processing, with the name and content of each piece decided dynamically from an id present in that CSV file. That led me to write a little code to open, write, and close files dynamically. I wrote the following code to achieve this.

a = range(10)
for each in a:
    # build and run "fl_<n> = open('<n>','a')" so the variable name is dynamic
    s = "fl_%s = open('%s','a')" % (each, each)
    exec s
    exec "fl_%s.write('%s')" % (each, each)
    com = "fl_%s.close()" % (each)
    exec com
    # can also check whether the file is closed or open with
    # com = "fl_%s.closed" % (each)
    # eval(com)  # returns True if the file is closed, else False
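
An alternative that avoids exec is to keep the open file objects in a dictionary keyed by id. The following Python 3 sketch is only an illustration: the helper name write_by_id, the (file_id, text) row format, and the temporary output directory are assumptions, not part of the original script.

```python
import os
import tempfile


def write_by_id(rows, out_dir):
    """Append each (file_id, text) row to out_dir/<file_id>, opening files on demand."""
    handles = {}
    try:
        for file_id, text in rows:
            if file_id not in handles:
                # open the file the first time this id is seen
                handles[file_id] = open(os.path.join(out_dir, str(file_id)), 'a')
            handles[file_id].write(text)
    finally:
        # close every file that was opened, even if a write failed
        for handle in handles.values():
            handle.close()
    return sorted(handles)


out_dir = tempfile.mkdtemp()
written = write_by_id([(1, 'a\n'), (2, 'b\n'), (1, 'c\n')], out_dir)
print(written)
```

With 60 output files this keeps at most 60 handles open at once, and handles[file_id].closed tells you whether a given file is still open.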
 

Monday, August 30, 2010

Join integer values

While working with Python's csv module I found something interesting about join.
I was reading a huge CSV file line by line, and for one operation I converted each row to a list and then that list back to a string. But my rows contain some integer values, so I always got
TypeError: sequence item 5: expected string, int found

So I am writing a small example to let newcomers know about this.
I have a list and I want to join it:
ls=['a','b',4,'c']
','.join(ls)
ends up with: TypeError: sequence item 2: expected string, int found

Instead, convert every item to a string first:
','.join(map(str,ls))
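
The same behaviour can be reproduced in Python 3 with the short sketch below. The variable names message and joined are only for illustration; note that Python 3 words the error slightly differently ("expected str instance, int found").

```python
ls = ['a', 'b', 4, 'c']

try:
    ','.join(ls)              # fails: item 2 is an int, not a string
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

joined = ','.join(map(str, ls))   # convert every item to str before joining
print(joined)
```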

Wednesday, August 25, 2010

Dictionary as Generator

What do you do if you are creating a dictionary dynamically and it ends up with millions of keys?
Accessing that dictionary later in your code can use a lot of resources, can't it?
I got stuck in this kind of situation when my dictionary grew to 10K million keys, so I used the dictionary as a generator to make my work easier.
The following code shows how to use a dictionary as a generator.

[code]
a = range(100000)
b = range(100000)
c = dict(zip(a, b))  # create a dictionary with 100000 keys
d_len = len(c)
# generator expression; iterkeys() avoids building a full list of keys first
d_keys = (k for k in c.iterkeys())
for i in range(d_len):
    key = d_keys.next()
    ## do your operation on key
[/code]
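
In Python 3 a dictionary can be iterated lazily without any extra work. The sketch below assumes Python 3.7 or newer (where dictionaries keep insertion order); the names total and first are only for illustration.

```python
c = dict(zip(range(100000), range(100000)))  # dictionary with 100000 keys

total = 0
for key in c:            # iterating a dict yields its keys one at a time
    total += c[key]      # do your operation on the key here

d_keys = iter(c)         # explicit iterator, advanced manually with next()
first = next(d_keys)
print(total, first)
```

No intermediate list of keys is created in either loop, which is the point of the generator approach.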

Monday, August 23, 2010

Rename multiple files simultaneously

Renaming multiple files at once from the command line is a little confusing. There are lots of ways to do it programmatically, but I didn't find any on-the-spot command for it.
So I used Python to do it simply.
My requirements were:
1) I have one dedicated folder in which all files have to be renamed.
2) All filenames to be renamed are structured: I have to rename every dedupe_<number>.csv to <number>.csv

I did this using the following code:

from os import listdir, getcwd, rename
from os.path import splitext


list_files = listdir(getcwd())
for filename in list_files:
    if not filename.startswith('.') and 'dedupe_' in filename:
        base, ext = splitext(filename)   # ext keeps its leading dot
        new_name = base.replace('dedupe_', '') + ext
        # rename directly instead of shelling out to mv,
        # so names containing spaces are handled correctly
        rename(filename, new_name)

Isn't it very simple!
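
Before renaming anything, it can help to preview which files would be touched. The following Python 3 sketch is a dry run under stated assumptions: the helper plan_renames and the strict dedupe_<number>.csv pattern are illustrative, and it only computes the new names without touching the filesystem.

```python
import re

# matches exactly dedupe_<number>.csv and captures the number
PATTERN = re.compile(r'^dedupe_(\d+)\.csv$')


def plan_renames(filenames):
    """Return (old, new) pairs for names matching dedupe_<number>.csv."""
    pairs = []
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            pairs.append((name, match.group(1) + '.csv'))
    return pairs


plan = plan_renames(['dedupe_1.csv', 'dedupe_22.csv', 'notes.txt', '.hidden'])
for old, new in plan:
    print(old, '->', new)
```

Each pair could then be passed to os.rename once the preview looks right.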

Subversion with SSL


I recently installed SVN on my system and configured it with SSL. Writing the steps down here might help me later, or help other people.

1. Install Apache (httpd)
            sudo ./configure --prefix=/opt/vivek/apache --enable-dav --enable-so --enable-ssl
            ## if this gives you an error like "configure: error: ...No recognized SSL/TLS toolkit detected", then install
            ## apt-get install openssl libssl-dev
            sudo make
            sudo make install

2. Install the dependencies for Subversion (check them using sh ./autogen.sh)

            1. Install sqlite
            2. Get the sqlite 3.6.13 amalgamation from:
                        http://www.sqlite.org/sqlite-amalgamation-3.6.13.tar.gz
                        Unpack the archive using tar/gunzip and copy sqlite3.c from the
                        resulting directory to:
                        /home/vivek/Desktop/TGZS/subversion-1.6.12/sqlite-amalgamation/sqlite3.c
                        This file also ships as part of the subversion-deps distribution.
            3. You need autoconf version 2.50 or newer installed (I used Synaptic)
            4. You need libtool version 1.4 or newer installed

3. Now install Subversion.
            sudo ./configure --prefix=/opt/vivek/subversion --with-apxs=/opt/vivek/apache/bin/apxs --with-apr=/opt/vivek/apache/bin/apr-1-config --with-apr-util=/opt/vivek/apache/bin/apu-1-config  --with-ssl
            sudo make
            sudo make install

4. After installation
            groupadd svn
            useradd -m -d /srv/svn/ -g svn svn
            After adding the user, I went to Users and Groups and enabled the account (set password 123456)

5. Create the repository as the svn user
            su - svn (enter the password of the svn user, 123456)
            $ mkdir /srv/svn/repositories/
            $ mkdir /srv/svn/repositories/myproduct
            $ mkdir /srv/svn/conf
            $ /opt/vivek/subversion/bin/svnadmin create /srv/svn/repositories/myproduct


6. Add the following to apache/conf/httpd.conf to give users HTTP access
            <Location /repos>
            DAV svn
            SVNParentPath /srv/svn/repositories
            # our access control policy
            AuthzSVNAccessFile /srv/svn/conf/users-access-file
            # try anonymous access first, resort to real
            # authentication if necessary.
            Satisfy Any
            Require valid-user
            # how to authenticate a user
            AuthType Basic
            AuthName "Subversion repository"
            AuthUserFile /srv/svn/conf/passwd
            </Location>

            CustomLog logs/svn_logfile "%t %u %{SVN-ACTION}e" env=SVN-ACTION

            The password file, /srv/svn/conf/passwd, can be created using apache/bin/htpasswd:
            htpasswd -m -c /srv/svn/conf/passwd vivek (run htpasswd --help first to see the options)
            It will prompt you for a password for vivek.

            ** This way you can add users for HTTP access.

            Add the following to /srv/svn/conf/users-access-file to set permissions for users.
            [/]
            * =
            [myproduct:/]
            vivek1 = rw
            vivek2 = r

            Run svnserve for the required location:
            /opt/vivek/subversion/bin/svnserve -d -r /srv/svn/repositories/myproduct

7. Now access the URL http://localhost/repos/myproduct


8. Add the project:
            sudo /opt/vivek/subversion/bin/svn import myproduct file:///srv/svn/repositories/myproduct -m "added project"
            /opt/vivek/subversion/bin/svn ls svn://localhost/myproduct


9. You can set permissions on the myproduct repository by changing the passwd and svnserve.conf files in /srv/svn/repositories/myproduct/conf/.

    Add the following to svnserve.conf
                        [general]
                        anon-access = read
                        auth-access = write
                        password-db = passwd
                        authz-db = authz
                        # realm = My First Repository
                        [sasl]
                        use-sasl = true


   Add the following to /srv/svn/repositories/myproduct/conf/authz

                        [groups]
                        group1 = vivek1
                        group2 = vivek2

                        [/]
                        vivek = rw
                        *=

                        [myproduct:/]
                        @group1 = rw
                        [myproduct:/]
                        @group2 = r    ## this won't allow the user to do svn co or commit

10. If you want to disable credential caching permanently, edit your runtime config file (located at /home/vivek/.subversion/config):

                        [auth]
                        store-auth-creds = no

Thanks to http://queens.db.toronto.edu/~nilesh/linux/subversion-howto/

Friday, August 20, 2010

Call Python script from Java.

I am not good at core Java programming, but I am good at "Hello World" kind of programs :) .
So I wrote a Java program to call a Python script (it can also pass argument values). Take a look.

This is the Java code:
    import java.io.*;

    // run this way
    // javac JavaRunCommand.java
    // java -classpath . JavaRunCommand

    public class JavaRunCommand {

        public static void main(String args[]) {

        String s = null;   // holds each line read from the process output

        try {

            String[]callAndArgs= {"python","my_python.py","arg1","arg2"};
            Process p = Runtime.getRuntime().exec(callAndArgs);
           
            BufferedReader stdInput = new BufferedReader(new
                 InputStreamReader(p.getInputStream()));

            BufferedReader stdError = new BufferedReader(new
                 InputStreamReader(p.getErrorStream()));

            // read the output
            while ((s = stdInput.readLine()) != null) {
                System.out.println(s);
            }
           
            // read any errors
            while ((s = stdError.readLine()) != null) {
                System.out.println(s);
            }
           
            System.exit(0);
        }
        catch (IOException e) {
            System.out.println("exception occurred");
            e.printStackTrace();
            System.exit(-1);
        }
        }
    }

In the Java code above, I am calling the my_python.py script. That script might contain anything: wxPython, mod_python, CGI programming, just anything.
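
For testing the Java caller, a minimal my_python.py could look like the Python 3 sketch below. The contents are an assumption for illustration (the original script is not shown); it simply echoes the arguments it receives, and the helper name format_args is hypothetical.

```python
import sys


def format_args(argv):
    """Return one line per command-line argument (argv[0] is the script name)."""
    return ["arg %d: %s" % (i, value) for i, value in enumerate(argv[1:], start=1)]


if __name__ == "__main__":
    for line in format_args(sys.argv):
        # anything printed to stdout is what the Java side reads
        # through p.getInputStream()
        print(line)
```

Anything the script writes to stderr would show up in the second loop of the Java program, which reads p.getErrorStream().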