I Sed, she sed
Does anyone here use sed much? I've got a bit stuck, could do with some help! :(
Whatcha trying to achieve? I use it on and off here.
Right, this is a simple example. In real life my data is a bit hairier, but the idea should carry over.
I have a file populated a little like this:
abcdefghijklmn<<o>>pqrstuvwxyz
abcdefghijklmnopqrstuvwxyz
abcdefg<h>ijklmnopqrstuvwxyz
abcdefghijkl<<m>>nopqr<<s>>tuvwxyz
abcdef<g>hijklmnopqrs<<t>>uvwxyz
abcdefghijklmno<<p>>qrstuvwxyz
I need to use sed to extract anything surrounded by << >>. So ideally I'd end up with:
o
m
s
t
p
It's frustrating me somewhat.
The fact you can have more than one pattern per line caused me some major grief, but if it's OK to use sed twice, then I managed to get around that one:
sed '/\(<<[^<>]*>>\)/s//\n\1\n/g' | sed '/^<<\([^<>]*\)>>$/s//\1/p;d'
The first sed breaks out the <<...>> so that each is split out onto a line by itself, then the second one picks out just those lines.
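To see the two stages separately, here is a small sketch. It assumes GNU sed (the \n in the replacement is a GNU extension), and the function names split_marked and pick_marked are made up for illustration.

```shell
#!/bin/sh
# Stage 1: put every <<...>> group on a line of its own.
# Assumes GNU sed, where \n in the replacement inserts a newline.
split_marked() {
  sed '/\(<<[^<>]*>>\)/s//\n\1\n/g'
}

# Stage 2: print only the lines that are exactly <<...>>, minus the markers;
# the trailing d throws every other line away.
pick_marked() {
  sed '/^<<\([^<>]*\)>>$/s//\1/p;d'
}

line='abcdefghijkl<<m>>nopqr<<s>>tuvwxyz'

printf '%s\n' "$line" | split_marked
# abcdefghijkl
# <<m>>
# nopqr
# <<s>>
# tuvwxyz

printf '%s\n' "$line" | split_marked | pick_marked
# m
# s
```

The empty pattern in s//.../ reuses the address regex in front of it, which is why each expression only spells the pattern out once.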
Thanks Mark. I've not used sed much, takes some getting used to.
Make sure you use the latest edit - I spotted a few bits of redundant code in the original and removed them.
sed '/\(<<[^<>]*>>\)/s//\n\1\n/g' | sed '/^<<\([^<>]*\)>>$/s//\1/p;d'
Such language on a family forum!
I do wonder what that would say when translated into semaphore and back. ;D ;D
OK, second challenge. This should be easy, I've nearly got it working but seem to be running in circles.
Say you have this:
{.One$="One",.Number=12,.Two$="Two",.Number=13,.Three$="Three",/
I need to grab everything between .Two$=" and the following "
A pint to whoever digs me out of sed hell #2.
I really should buy a book on this...
http://www.amazon.co.uk/sed-awk-Pocket-Reference-OReilly/dp/0596003528/ref=sr_1_2?ie=UTF8&s=books&qid=1214233854&sr=8-2
LeperousDust
23-06-2008, 16:23
Jesus, that actually looks foreign to me... :huh:
Book shmook, got there in the end.
Still, it's a handy thing to learn.
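The thread never shows what the poster ended up with for the second challenge (grabbing the text between .Two$=" and the next "). One possible sketch, not necessarily what was used, with grab_two as a made-up function name:

```shell
#!/bin/sh
# Print whatever sits between .Two$=" and the next double quote.
# \. and \$ make the dot and dollar literal; [^"]* stops at the closing quote.
# -n plus the p flag means lines without a match print nothing.
grab_two() {
  sed -n 's/.*\.Two\$="\([^"]*\)".*/\1/p'
}

printf '%s\n' '{.One$="One",.Number=12,.Two$="Two",.Number=13,.Three$="Three",/' | grab_two
# Two
```

If .Two$=" could appear more than once on a line, the leading .* is greedy, so this picks the last occurrence.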
The fact you can have more than one pattern per line caused me some major grief, but if it's OK to use sed twice, then I managed to get around that one:
sed '/\(<<[^<>]*>>\)/s//\n\1\n/g' | sed '/^<<\([^<>]*\)>>$/s//\1/p;d'
The first sed breaks out the <<...>> so that each is split out onto a line by itself, then the second one picks out just those lines.
Just realised this screws up when it comes across something like:
abcdefgh<<ijklmno>pqrst>>uvwxyz
abcdefgh<<ijklmno<pqrst>>uvwxyz
:'(
You may have reached the limits of what sed can do for you there. It'll also break with this one:
abcdefgh<<ijklmno<>pqrst>>uvwxyz
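A quick way to see the failure, as a sketch assuming GNU sed; extract_simple is just a made-up wrapper around the two-stage pipeline from earlier in the thread:

```shell
#!/bin/sh
# The thread's two-stage pipeline, wrapped in a function for the demo.
extract_simple() {
  sed '/\(<<[^<>]*>>\)/s//\n\1\n/g' | sed '/^<<\([^<>]*\)>>$/s//\1/p;d'
}

# [^<>]* stops at the lone '>', where the regex then needs '>>' but finds '>p',
# so stage 1 never matches and nothing is printed.
printf '%s\n' 'abcdefgh<<ijklmno>pqrst>>uvwxyz' | extract_simple

# Without a stray bracket inside the markers, the same pipeline still works.
printf '%s\n' 'abcdefgh<<ijklmno>>uvwxyz' | extract_simple
# ijklmno
```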
Any suggestions, Mr Mark? I think I need to buy you a pub... ;D
Windows or Linux? There's a way to cheat it that I hadn't thought of before, but I'll have to put it into a script as it uses unprintable characters.
Linux. Any suggestions would be more than welcome.
OK - one more.
If I gave it:
abcdefg<<<<hijkl>>mno>>pqrstuvwxyz
Would the answer 'hijkl' be OK, or do you want '<<hijkl>>mno', or is that just a case of 'never going to happen'?
I've solved all the other cases, but this one might resist.
It'd never exist. The '<<' and '>>' will always match, there may be more than 1 per line, they would never overlap, and there may be single '<' and '>' in each.
http://www.ouroboros.me.uk/misc/for-goose.sh
Converts:
abcdefghijklmn<<o>>pqrstuvwxyz
abcdefghijklmnopqrstuvwxyz
abcdefg<h>ijklmnopqrstuvwxyz
abcdefghijkl<<m>>nopqr<<s>>tuvwxyz
abcdef<g>hijklmnopqrs<<t>>uvwxyz
abcdefghijklmno<<p>>qrstuvwxyz
abcdefgh<<ijklmno>pqrst>>uvwxyz
abcdefgh<<ijklmno<pqrst>>uvwxyz
abcdefgh<<ijklmno<>pqrst>>uvwxyz
abcdefg<<<<hijkl>>mno>>pqrstuvwxyz
To:
o
m
s
t
p
ijklmno>pqrst
ijklmno<pqrst
ijklmno<>pqrst
hijkl
I think (hope) that's everything. ;D
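The linked script isn't reproduced in the thread, so what follows is not that script, only a guess at how the "unprintable characters" trick might go. It assumes GNU sed, and that the bytes 0x01 and 0x02 never occur in the data. The name extract_marked is made up.

```shell
#!/bin/sh
# Swap each two-character marker for a single control character, so that
# "anything except a marker" can be written as a plain bracket expression.
extract_marked() {
  open=$(printf '\001')    # stands in for <<
  close=$(printf '\002')   # stands in for >>
  sed "s/<</$open/g; s/>>/$close/g" |
    # Put every open...close group on a line of its own (GNU sed \n).
    sed "s/$open\([^$open$close]*\)$close/\n$open\1$close\n/g" |
    # Keep only lines that are exactly one group, minus the markers.
    sed -n "s/^$open\(.*\)$close\$/\1/p"
}

printf '%s\n' \
  'abcdefghijkl<<m>>nopqr<<s>>tuvwxyz' \
  'abcdefgh<<ijklmno<>pqrst>>uvwxyz' \
  'abcdefg<<<<hijkl>>mno>>pqrstuvwxyz' | extract_marked
# m
# s
# ijklmno<>pqrst
# hijkl
```

Single < and > pass through untouched because only the doubled forms are rewritten. In the nested case the innermost group wins, which gives 'hijkl'.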