Today, while working with CSV files, I ran into a situation where I had a list of a million values to iterate over. I thought it would be easier to split the list into a number of pieces so that processing it would not strain my code, memory or CPU. I could also have used a generator expression to make my code run faster, but I was curious to write code that splits a list into a given number of pieces (it uses a little of the generator idea).
Here is the result:
def split_seq(seq, num_pieces):
    """ split a list into the number of pieces passed as param """
    start = 0
    for i in xrange(num_pieces):
        # len(seq[i::num_pieces]) is the size of the i-th piece
        stop = start + len(seq[i::num_pieces])
        yield seq[start:stop]
        start = stop

seq = range(100) ## define your list here
num_of_pieces = 3
# use a separate loop variable so the original list is not shadowed
for piece in split_seq(seq, num_of_pieces):
    print len(piece), '-> ', piece
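The post's code is Python 2. As a rough sketch of the same idea in Python 3, the piece sizes can be computed with divmod instead of building a strided slice just to measure its length. The names values and pieces are only for illustration.

```python
def split_seq(seq, num_pieces):
    """Yield num_pieces contiguous slices of seq whose lengths differ by at most one."""
    base, extra = divmod(len(seq), num_pieces)
    start = 0
    for i in range(num_pieces):
        # The first `extra` pieces take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        yield seq[start:stop]
        start = stop


values = list(range(100))  # illustrative data, as in the post
pieces = list(split_seq(values, 3))
for piece in pieces:
    print(len(piece), '->', piece)
```

Because split_seq is a generator, each slice is only created when the loop asks for it.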
Friday, September 10, 2010
Friday, September 3, 2010
Dynamically open file
Recently I had to split a 40GB CSV file into 60 pieces for further processing, with each piece's name and content decided dynamically from an id present in the CSV file. That led me to write a little code to open, write and close files dynamically. I wrote the following code to achieve this.
import os, sys
a=range(10)
for each in a:
    # build and run a statement like: fl_0 = open('0','a')
    s = "fl_%s = open('%s','a')" % (each,each)
    exec s
    exec "fl_%s.write('%s')" % (each,each)
    com = "fl_%s.close()" % (each)
    exec com
    # can also check if file is closed or open by
    # com = "fl_%s.closed" % (each)
    # eval(com) # returns True if file is closed else False (bool(com) is always True, since com is a non-empty string)
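An alternative to building variable names with exec is to keep the open files in a dictionary keyed by id. The following is a minimal Python 3 sketch of that idea, not the code I used on the 40GB file. The helper name split_rows_by_id, the (id, text) row shape and the .csv output names are assumptions made for the example.

```python
import os
import tempfile
from contextlib import ExitStack


def split_rows_by_id(rows, out_dir):
    """Append each (file_id, text) row to out_dir/<file_id>.csv (hypothetical helper)."""
    handles = {}
    # ExitStack closes every file it was given, even if a write fails.
    with ExitStack() as stack:
        for file_id, text in rows:
            if file_id not in handles:
                path = os.path.join(out_dir, "%s.csv" % file_id)
                handles[file_id] = stack.enter_context(open(path, "a"))
            handles[file_id].write(text + "\n")
    return sorted(handles)


demo_dir = tempfile.mkdtemp()  # throwaway folder for the demo
demo_ids = split_rows_by_id([(1, "a"), (2, "b"), (1, "c")], demo_dir)
print(demo_ids)
```

Each file is opened only the first time its id appears, so the number of open files never exceeds the number of distinct ids.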
Monday, August 30, 2010
Join integer value
While working with Python's csv module, I learned something interesting about join.
I was reading a huge CSV file line by line and, for one operation, I converted each row to a list and then that list back to a string. But my rows contained some integer values, so I always got
TypeError: sequence item 5: expected string, int found
So I am writing this small example to let newcomers know about it.
I have a list and I want to join it:
ls=['a','b',4,'c']
','.join(ls)
ends up with: TypeError: sequence item 2: expected string, int found
Instead, convert every item to a string first:
','.join(map(str,ls))
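The same behaviour can be checked in Python 3 with a short sketch. The variable names row, error_name and line are just for illustration, and the exact wording of the error message differs between Python versions.

```python
row = ['a', 'b', 4, 'c']  # one integer mixed in with the strings

try:
    ','.join(row)
except TypeError as exc:
    # join refuses non-string items
    error_name = type(exc).__name__
    print(error_name, exc)

# Converting each item with str() first makes the join succeed.
line = ','.join(str(item) for item in row)
print(line)
```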
Wednesday, August 25, 2010
Dictionary as Generator
What do you do if you are building a dictionary dynamically and it ends up with millions of keys?
Accessing that dictionary later in your code can consume a lot of resources, can't it?
I got stuck in this kind of situation when my dictionary reached 10K million keys, so I iterated over the dictionary's keys with a generator to make my work easier.
The following code shows how to use a dictionary as a generator.
[code]
a=range(100000)
b=range(100000)
c=dict(zip(a,b)) #create dictionary with 100000 keys
d_len=len(c)
# generator expression; iterkeys() avoids building the full key list that keys() would create
d_keys = (k for k in c.iterkeys())
for i in xrange(d_len):
    key = d_keys.next()
    ## do your operation on keys
[/code]
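In Python 3 a dictionary can be iterated lazily without any extra wrapper, because iterating over it yields one key at a time. Here is a minimal sketch of that; process_keys is a hypothetical helper and summing the keys stands in for whatever operation you need.

```python
def process_keys(mapping):
    """Visit every key one at a time without copying the keys into a list."""
    total = 0
    for key in mapping:  # iterating a dict yields its keys lazily
        total += key     # placeholder for the real per-key operation
    return total


c = dict(zip(range(100000), range(100000)))  # dictionary with 100000 keys

key_iter = iter(c)          # explicit iterator, like d_keys in the post
first_key = next(key_iter)  # Python 3 spelling of d_keys.next()
total = process_keys(c)
print(first_key, total)
```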
Monday, August 23, 2010
Rename multiple files simultaneously
Renaming multiple files at once from the command line is a little confusing. There are lots of ways to do it programmatically, but I didn't find any on-the-spot command for it.
So I used Python to do it simply.
My requirements were:
1) I have one dedicated folder in which all files have to be renamed.
2) All the filenames to be renamed are structured: I have to rename every dedupe_<number>.csv to <number>.csv
I did this using the following code:
import os
from os import listdir, getcwd, rename
list_files = listdir(getcwd())
for filename in list_files:
    if not filename.startswith('.') and 'dedupe_' in filename:
        # splitext keeps any extra dots in the name intact
        base, ext = os.path.splitext(filename)
        new_name = base.replace('dedupe_','') + ext
        # rename directly instead of shelling out to mv, so odd filenames are safe
        rename(filename, new_name)
Isn't it very simple!
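For anyone on Python 3, the same rename can be sketched with pathlib. This is only an illustration: strip_prefix is a hypothetical helper, it assumes the prefix sits at the start of the filename, and the demo runs in a temporary folder rather than the current directory.

```python
import tempfile
from pathlib import Path


def strip_prefix(folder, prefix="dedupe_"):
    """Rename <prefix><name> to <name> for every file in folder (hypothetical helper)."""
    renamed = []
    # sorted() takes a snapshot of the listing, so renaming inside the loop is safe
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.name.startswith(prefix):
            target = path.with_name(path.name[len(prefix):])
            path.rename(target)
            renamed.append(target.name)
    return renamed


demo = Path(tempfile.mkdtemp())  # throwaway folder for the demo
for name in ("dedupe_1.csv", "dedupe_2.csv", "notes.txt"):
    (demo / name).touch()

result = strip_prefix(demo)
print(result)
```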
Subversion with SSL
I recently installed SVN on my system and configured it with SSL. I am recording the steps here in case they help me later, or help other people.
1. Install Apache (httpd)
sudo ./configure --prefix=/opt/vivek/apache --enable-dav --enable-so --enable-ssl
## if this gives you an error like "configure: error: ...No recognized SSL/TLS toolkit detected" then install the SSL packages first:
## apt-get install openssl libssl-dev
sudo make
sudo make install
2. Install the dependencies for Subversion (check the dependencies using sh ./autogen.sh)
   1. Install sqlite
   2. Get the sqlite 3.6.13 amalgamation from:
      http://www.sqlite.org/sqlite-amalgamation-3.6.13.tar.gz
      Unpack the archive using tar/gunzip and copy sqlite3.c from the
      resulting directory to:
      /home/vivek/Desktop/TGZS/subversion-1.6.12/sqlite-amalgamation/sqlite3.c
      This file also ships as part of the subversion-deps distribution.
   3. You need autoconf version 2.50 or newer installed (I used Synaptic)
   4. You need libtool version 1.4 or newer installed
3. Install subversion now.
sudo ./configure --prefix=/opt/vivek/subversion --with-apxs=/opt/vivek/apache/bin/apxs --with-apr=/opt/vivek/apache/bin/apr-1-config --with-apr-util=/opt/vivek/apache/bin/apu-1-config --with-ssl
sudo make
sudo make install
4. After installation, add a group and a user for svn
groupadd svn
useradd -m -d /srv/svn/ -g svn svn
After adding the user, I went to Users and Groups and enabled the user (set password 123456).
5. Switch to the svn user and create the repository
su - svn (give the password of the svn user - 123456)
$ mkdir /srv/svn/repositories/
$ mkdir /srv/svn/repositories/myproduct
$ mkdir /srv/svn/conf
$ /opt/vivek/subversion/bin/svnadmin create /srv/svn/repositories/myproduct
6. Add the following to apache/conf/httpd.conf to give users HTTP access
<Location /repos>
DAV svn
SVNParentPath /srv/svn/repositories
# our access control policy
AuthzSVNAccessFile /srv/svn/conf/users-access-file
# try anonymous access first, resort to real
# Authentication if necessary.
Satisfy Any
Require valid-user
# how to authenticate a user
AuthType Basic
AuthName "Subversion repository"
AuthUserFile /srv/svn/conf/passwd
</Location>
CustomLog logs/svn_logfile "%t %u %{SVN-ACTION}e" env=SVN-ACTION
The file /srv/svn/conf/passwd can be created using apache/bin/htpasswd:
htpasswd -m -c /srv/svn/conf/passwd vivek (use htpasswd --help first for options)
It will prompt you for a password for vivek.
** This way you can add users for HTTP access.
Add the following to /srv/svn/conf/users-access-file to set permissions for users.
[/]
* =
[myproduct:/]
vivek1 = rw
vivek2 = r
Run svnserve for the required location:
/opt/vivek/subversion/bin/svnserve -d -r /srv/svn/repositories/myproduct
7. Now access the URL http://localhost/repos/myproduct
8. Add the project as follows:
sudo /opt/vivek/subversion/bin/svn import myproduct file:///srv/svn/repositories/myproduct -m "added project"
/opt/vivek/subversion/bin/svn ls svn://localhost/myproduct
9. You can set permissions on the myproduct repository by editing /srv/svn/repositories/myproduct/conf/passwd and the svnserve.conf file.
Add the following to svnserve.conf
[general]
anon-access = read
auth-access = write
password-db = passwd
authz-db = authz
# realm = My First Repository
[sasl]
use-sasl = true
Add the following to /srv/svn/repositories/myproduct/conf/authz
[groups]
group1 = vivek1
group2 = vivek2
[/]
vivek = rw
*=
[myproduct:/]
@group1 = rw
[myproduct:/]
@group2 = r ## this won't allow user to do svn co or commit
10. If you want to disable credential caching permanently, you can edit your runtime config file (located in /home/vivek/.subversion/config).
[auth]
store-auth-creds = no
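To double-check who gets which permission, the users-access-file from step 6 can be read with a small Python 3 sketch. This assumes the file is INI-compatible enough for Python's configparser, which is my assumption and not something Subversion promises. read_permissions is a hypothetical helper, and the file content is pasted in as a string so the example is self-contained.

```python
from configparser import ConfigParser

# Same content as the users-access-file in step 6.
ACCESS_FILE = """\
[/]
* =

[myproduct:/]
vivek1 = rw
vivek2 = r
"""


def read_permissions(text):
    """Return {section: {user: permission}} from access-file text (hypothetical helper)."""
    # Only "=" is a delimiter here, so the ":" in section names like myproduct:/ is harmless.
    parser = ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep user names case-sensitive
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


permissions = read_permissions(ACCESS_FILE)
for section, users in permissions.items():
    print(section, users)
```

An empty value, as for * in the [/] section, means no access.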
Thanks to http://queens.db.toronto.edu/~nilesh/linux/subversion-howto/
Friday, August 20, 2010
Call Python script from Java.
I am not good at core Java programming, only at "Hello World" kinds of programs :) .
So I wrote a Java program to call a Python script (it can also pass argument values). Take a look.
This is the Java code:
import java.io.*;
// run this way
// javac JavaRunCommand.java
// java -classpath . JavaRunCommand
public class JavaRunCommand {
    public static void main(String args[]) {
        // holds each line read from the child process
        String s = null;
        try {
            // command to run, followed by its arguments
            String[] callAndArgs = {"python", "my_python.py", "arg1", "arg2"};
            Process p = Runtime.getRuntime().exec(callAndArgs);
            BufferedReader stdInput = new BufferedReader(new
                InputStreamReader(p.getInputStream()));
            BufferedReader stdError = new BufferedReader(new
                InputStreamReader(p.getErrorStream()));
            // read the output
            while ((s = stdInput.readLine()) != null) {
                System.out.println(s);
            }
            // read any errors
            while ((s = stdError.readLine()) != null) {
                System.out.println(s);
            }
            System.exit(0);
        }
        catch (IOException e) {
            System.out.println("exception occurred");
            e.printStackTrace();
            System.exit(-1);
        }
    }
}
In the above Java code, I am calling the my_python.py script. That script might contain anything: wxPython, mod_python, CGI programming, just anything.
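For completeness, here is a minimal sketch of what my_python.py could look like on the Python side (written for Python 3). The post does not show the real script, so its content here is hypothetical: it simply echoes the arguments it receives, which is enough to see them arrive in the Java program's output.

```python
import sys


def describe_args(argv):
    """Return one output line per argument after the script name."""
    return ["arg %d: %s" % (i, value) for i, value in enumerate(argv[1:], start=1)]


if __name__ == "__main__":
    # Everything printed here is read by the Java side through p.getInputStream().
    for line in describe_args(sys.argv):
        print(line)
```

Anything the script writes to stderr ends up in the second loop of the Java program, which reads p.getErrorStream().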