How to find patterns across multiple lines using grep?



I want to find files that have "abc" AND "efg" in that order, where the two strings are on different lines in the file. For example, a file with this content:

blah blah..
blah blah..
blah abc blah
blah blah..
blah blah..
blah blah..
blah efg blah blah
blah blah..
blah blah..

Should be matched.



1:

Grep is not sufficient for this operation. pcregrep, which is found in most modern Linux systems, can be used as:
pcregrep -M  'abc.*(\n|.)*efg' test.txt 

2:

I'm not sure if it is possible with grep, but sed makes it very easy:
sed -e '/abc/,/efg/!d' [file-with-content] 

3:

Here's a quick fix inspired by http://stackoverflow.com/a/7167115 (@MichaelMior - thanks for the link). If 'abc' and 'efg' can be on the same line:
grep -zl 'abc.*efg' <your list of files>
If 'abc' and 'efg' must be on different lines:
grep -Pzl '(?s)abc.*\n.*efg' <your list of files>
  • -z - treat the input as a set of lines, each terminated by a zero byte instead of a newline, i.e. grep treats the input as one big line
  • -l - print the name of each input file from which output would normally have been printed
  • (?s) - activate PCRE_DOTALL, which means that '.' matches any character, including a newline
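
To see how these options interact, here is a small sketch that runs the second command against three scratch files. The file names and contents are made up for the demonstration, and since grep -P is a GNU extension that some builds lack, the sketch checks for it first.

```shell
# Hypothetical scratch files, created only for this demonstration.
dir=$(mktemp -d)
printf 'blah abc blah\nblah blah\nblah efg blah\n' > "$dir/split.txt"   # abc and efg on different lines
printf 'blah abc blah efg\n' > "$dir/same.txt"                          # both on one line
printf 'blah efg\nblah abc\n' > "$dir/reversed.txt"                     # wrong order

if printf 'x\n' | grep -Pq 'x' 2>/dev/null; then
    # -z: each file is read as one record; (?s): '.' also matches a newline;
    # the explicit \n requires at least one line break between abc and efg.
    matches=$(cd "$dir" && grep -Pzl '(?s)abc.*\n.*efg' split.txt same.txt reversed.txt)
else
    matches="grep -P unavailable"
fi
echo "$matches"
rm -rf "$dir"
```

Where grep -P is supported, only split.txt should be listed: same.txt has no line break between the two strings, and reversed.txt has them in the wrong order.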

4:

I wanted to comment, but I'm not allowed to because my reputation isn't high enough. sed should suffice, as poster LJ stated above. Instead of !d you can simply use p to print:
sed -n '/abc/,/efg/p' file 
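
One thing to keep in mind with the sed range: if no line matching efg follows the abc line, the range runs to the end of the file, so non-empty output alone does not prove that both strings are present. The sketch below shows that behaviour on two made-up files and wraps the command in a hypothetical helper, has_pair, that turns it into a yes/no test.

```shell
# Hypothetical scratch files, created only for this demonstration.
dir=$(mktemp -d)
printf 'one\nabc here\ntwo\nefg here\nthree\n' > "$dir/both.txt"
printf 'one\nabc here\ntwo\nthree\n' > "$dir/abc_only.txt"

# With both patterns present, the range stops at the efg line.
both_out=$(sed -n '/abc/,/efg/p' "$dir/both.txt")
# With no efg after abc, the range runs on to the end of the file.
abc_only_out=$(sed -n '/abc/,/efg/p' "$dir/abc_only.txt")

# has_pair FILE (hypothetical helper): drop the first printed line (the abc
# line), then look for efg in what remains of the range output.
has_pair() {
    sed -n '/abc/,/efg/p' "$1" | sed 1d | grep -q 'efg'
}

if has_pair "$dir/both.txt"; then r_both=yes; else r_both=no; fi
if has_pair "$dir/abc_only.txt"; then r_abc_only=yes; else r_abc_only=no; fi
echo "both.txt: $r_both, abc_only.txt: $r_abc_only"
rm -rf "$dir"
```

Dropping the first line with sed 1d also means that a file with abc and efg only on the same line is not reported, which fits the question's requirement that they be on different lines.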

5:

You can do this very easily if you can use Perl.
perl -ne 'if (/abc/) { $abc = 1; next }; print "Found in $ARGV\n" if ($abc && /efg/);' yourfilename.txt
You can do this with a single regular expression too, but that involves reading the entire contents of the file into a single string, which might end up taking too much memory with large files. For completeness, here is that method:
perl -e '@lines = <>; $content = join("", @lines); print "Found in $ARGV\n" if ($content =~ /abc.*efg/s);' yourfilename.txt 

6:

Sadly, you can't. From the grep docs:
grep searches the named input FILEs (or standard input if no files are named, or if a single hyphen-minus (-) is given as file name) for lines containing a match to the given PATTERN.

7:

I don't know how I would do this with grep, but I would do something like this with awk:
awk '/abc/{ln1=NR} /efg/{ln2=NR} END{if(ln1 && ln2 && ln1 < ln2){print "found"}else{print "not found"}}' foo 
You need to be careful how you do this, though. Do you want the regex to match the substring or the entire word? Add \w tags as appropriate. Also, while this strictly conforms to how you stated the example, it doesn't quite work when abc appears a second time after efg. If you want to handle that, add an if as appropriate in the /abc/ case etc.
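
One possible way to handle the repeated-abc case is to remember only the first abc line and stop at the first efg line after it. This is a sketch rather than a drop-in replacement: the function name has_ordered_pair and the scratch files are made up, and it reports through the exit status instead of printing "found".

```shell
# has_ordered_pair FILE (hypothetical helper): succeed if a line matching
# efg comes after an earlier line matching abc.
has_ordered_pair() {
    awk '
        !seen && /abc/ { seen = 1; next }   # remember the first abc, move to the next line
        seen && /efg/  { found = 1; exit }  # first efg on a later line: stop reading
        END            { exit !found }      # exit status 0 only when the pair was found
    ' "$1"
}

# Hypothetical scratch files, created only for this demonstration.
dir=$(mktemp -d)
printf 'abc\nefg\nabc\n' > "$dir/repeat.txt"     # abc appears again after efg
printf 'efg\nabc\n' > "$dir/reversed.txt"        # efg only before abc

if has_ordered_pair "$dir/repeat.txt"; then r_repeat=found; else r_repeat="not found"; fi
if has_ordered_pair "$dir/reversed.txt"; then r_reversed=found; else r_reversed="not found"; fi
echo "repeat.txt: $r_repeat"
echo "reversed.txt: $r_reversed"
rm -rf "$dir"
```

Because the decision is made as soon as the first qualifying efg line is seen, a later abc cannot overwrite the recorded position the way ln1 is overwritten in the one-liner.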

8:

awk one-liner:
awk '/abc/,/efg/' [file-with-content] 

9:

While the sed option is the simplest and easiest, LJ's one-liner is sadly not the most portable. Those stuck with a version of the C Shell will need to escape their bangs:
sed -e '/abc/,/efg/\!d' [file]
This unfortunately does not work in bash et al.

10:

I released a grep alternative a few days ago that does support this directly, either via multiline matching or using conditions. Hopefully it is useful for some people searching here. This is what the commands for the example would look like:
Multiline: sift -lm 'abc.*efg' testfile
Conditions: sift -l 'abc' testfile --followed-by 'efg'
You could also specify that 'efg' has to follow 'abc' within a certain number of lines:
sift -l 'abc' testfile --followed-within 5:'efg'
You can find more information on sift-tool.org.

11:

I relied heavily on pcregrep, but with newer grep you do not need to install pcregrep for many of its features. Just use grep -P. In the case of the OP's question, I think the following options work nicely, with the second best matching how I understand the question:
grep -Pzo "abc(.|\n)*efg" /tmp/tes*
grep -Pzl "abc(.|\n)*efg" /tmp/tes*
I copied the text as /tmp/test1, then deleted the 'g' and saved that as /tmp/test2. Here is the output showing that the first command shows the matched string and the second shows only the filename (typically -o shows the match and -l shows only the filename). Note that the 'z' is necessary for multiline matching, and '(.|\n)' means to match either 'anything other than newline' or 'newline', i.e. anything:
user@host:~$ grep -Pzo "abc(.|\n)*efg" /tmp/tes*
/tmp/test1:abc blah
blah blah..
blah blah..
blah blah..
blah efg
user@host:~$ grep -Pzl "abc(.|\n)*efg" /tmp/tes*
/tmp/test1
To determine if your version is new enough, run man grep and see if something similar to this appears near the top:
   -P, --perl-regexp
          Interpret PATTERN as a Perl regular expression (PCRE, see
          below).  This is highly experimental and grep -P may warn of
          unimplemented features.
That is from GNU grep 2.10.

12:

This can be done easily by first using tr to replace the newlines with some other character:
tr '\n' '\a' | grep 'abc.*def' | tr '\a' '\n' 
Here, I am using the alarm character, \a (ASCII 7), in place of a newline. This is almost never found in your text, and grep can match it with a ., or match it specifically with \a.
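
As a rough illustration of the tr idea applied to the question's strings, the sketch below feeds a made-up three-line file through tr and lets grep -q decide. The file name and the tr_result variable are only there for the demonstration.

```shell
# Hypothetical scratch file, created only for this demonstration.
dir=$(mktemp -d)
printf 'blah abc blah\nblah blah\nblah efg blah\n' > "$dir/sample.txt"

# Every newline becomes \a, so grep sees one long line and '.' can span
# what used to be line breaks.
if tr '\n' '\a' < "$dir/sample.txt" | grep -q 'abc.*efg'; then
    tr_result="matched"
else
    tr_result="no match"
fi
echo "$tr_result"
rm -rf "$dir"
```

Note that this treats the whole file as a single line, so it also matches when abc and efg happen to be on the same original line.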

13:

#!/bin/bash
shopt -s nullglob
for file in *
do
  # r is 1 only when a line matching efg is seen after abc has matched
  r=$(awk '/abc/{f=1} f&&/efg/{g=1;exit} END{print g&&f ?1:0}' "$file")
  if [ "$r" -eq 1 ]; then
    echo "Found pattern in $file"
  else
    echo "not found"
  fi
done

14:

You can use grep in case you do not care about the order of the patterns.
grep -l "pattern1" filepattern*.* | xargs grep "pattern2"
Example:
grep -l "vector" *.cpp | xargs grep "map"
grep -l will find all the files which match the first pattern, and xargs will grep for the second pattern. Hope this helps.
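
A small sketch of what this pipeline reports, using made-up scratch files and the question's strings. The second grep here also gets -l, which is a change from the command above so that only file names are printed.

```shell
# Hypothetical scratch files, created only for this demonstration.
dir=$(mktemp -d)
printf 'abc\nefg\n' > "$dir/ordered.txt"
printf 'efg\nabc\n' > "$dir/reversed.txt"
printf 'abc\n' > "$dir/abc_only.txt"

# First grep lists files containing abc; xargs passes those names to a
# second grep that keeps only the ones also containing efg.
both=$(cd "$dir" && grep -l 'abc' *.txt | xargs grep -l 'efg')
echo "$both"
rm -rf "$dir"
```

Both ordered.txt and reversed.txt are listed, which confirms that the order of the two patterns is not checked. File names containing spaces would need extra care with xargs.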

15:

If you are willing to use contexts, this could be achieved by typing:
grep -A 500 abc test.txt | grep -B 500 efg 
This will display everything between "abc" and "efg", as long as they are within 500 lines of each other..

16:

With silver searcher:
ag 'abc.*(\n|.)*efg'
Similar to ring bearer's answer, but with ag instead. The speed advantages of silver searcher could possibly shine here.

17:

As an alternative to Balu Mohan's answer, it is possible to enforce the order of the patterns using only grep, head and tail:
for f in FILEGLOB; do tail $f -n +$(grep -n "pattern1" $f | head -n1 | cut -d : -f 1) 2>/dev/null | grep "pattern2" &>/dev/null && echo $f; done
This one isn't very pretty, though. Formatted more readably:
for f in FILEGLOB; do
    tail $f -n +$(grep -n "pattern1" $f | head -n1 | cut -d : -f 1) 2>/dev/null \
    | grep -q "pattern2" \
    && echo $f
done
This will print the names of all files where "pattern2" appears after "pattern1", or where both appear on the same line:
$ echo "abc
def" > a.txt
$ echo "def
abc" > b.txt
$ echo "abcdef" > c.txt; echo "defabc" > d.txt
$ for f in *.txt; do tail $f -n +$(grep -n "abc" $f | head -n1 | cut -d : -f 1) 2>/dev/null | grep -q "def" && echo $f; done
a.txt
c.txt
d.txt

Explanation

  • tail -n +i - print all lines after the ith, inclusive
  • grep -n - prepend matching lines with their line numbers
  • head -n1 - print only the first row
  • cut -d : -f 1 - print the first cut column using : as the delimiter
  • 2>/dev/null - silence tail error output that occurs if the $() expression returns empty
  • grep -q - silence grep and return immediately if a match is found, since we are only interested in the exit code
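
The steps explained above can also be wrapped in a small function, which makes the empty-line-number case explicit instead of relying on 2>/dev/null. This is only a sketch: the name pattern_after and the temporary directory are made up, and the four files mirror the a.txt to d.txt example.

```shell
# pattern_after FIRST SECOND FILE (hypothetical helper): succeed if SECOND
# matches on or after the first line of FILE that matches FIRST.
pattern_after() {
    start=$(grep -n -- "$1" "$3" | head -n1 | cut -d : -f 1)
    [ -n "$start" ] || return 1               # FIRST never matches
    tail -n +"$start" "$3" | grep -q -- "$2"  # search from that line onward
}

# Scratch copies of the four example files.
dir=$(mktemp -d)
printf 'abc\ndef\n' > "$dir/a.txt"
printf 'def\nabc\n' > "$dir/b.txt"
printf 'abcdef\n' > "$dir/c.txt"
printf 'defabc\n' > "$dir/d.txt"

found=""
for f in a.txt b.txt c.txt d.txt; do
    if pattern_after abc def "$dir/$f"; then found="$found $f"; fi
done
echo "matched:$found"
rm -rf "$dir"
```

As in the loop above, d.txt is reported even though def comes before abc, because the order is not enforced within a single line.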

18:

If you need both words to be close to each other, for example no more than 3 lines apart, you can do this:
find . -exec grep -Hn -C 3 "abc" {} \; | grep -C 3 "efg"
Same example but filtering only *.txt files:
find . -name '*.txt' -exec grep -Hn -C 3 "abc" {} \; | grep -C 3 "efg"
You can also replace the grep command with egrep if you also want to search with regular expressions.

19:

This should work too?!
perl -0777 -ne 'print "$ARGV\n" if /abc.*?efg/s' file_list
$ARGV contains the name of the current file when reading from file_list. -0777 makes perl read each file as a single string, and the /s modifier lets . match across newlines.

