Compound Find (beginning user)


#1

Hi. I am working with a large volume of XML files and would like to use the find command to isolate files that match certain parameters. I see that I can use find for one word or phrase, but can it also search for multiple data points at once? For example, I want to find “industry”=abc AND “country”=US.

Thanks for the help.


#2

You’d have to say more about how a file relates to those criteria.
Are you searching for any file that contains both?
Will they be on the same line?

What about the fact that the files are XML?
Are you searching for multiple tags inside a single file?

You may be able to make progress with a regex.
Be aware that Atom’s find-in-multiple-files feature doesn’t really handle multi-line regexes well.

industry\s*=\s*"abc"[^\>]*country\s*=\s*"US"
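To make the limits of that pattern concrete, here is a minimal sketch in Python. It assumes the values are stored as attributes on a single tag, with industry written before country; the sample strings are made up for illustration and are not taken from your files.

```python
import re

# Same pattern as above. It only matches when both attributes sit inside
# one tag and "industry" appears before "country".
PATTERN = re.compile(r'industry\s*=\s*"abc"[^>]*country\s*=\s*"US"')

def matches(text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return PATTERN.search(text) is not None

# Hypothetical sample tags, for illustration only.
sample_hit = '<company industry="abc" size="10" country="US"/>'
sample_miss = '<company country="US" industry="abc"/>'

print(matches(sample_hit))   # attributes in the expected order
print(matches(sample_miss))  # same data, reversed order
```

The second sample holds the same data but is not found, because the attribute order is reversed. That is the main weakness of the regex approach.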


A more robust solution might be to load those files into a database, or a database-like structure, and run queries on that. There are also a few data mining environments that may help you, like R or some of the Python ones. Even Excel, to some degree.

I know you said beginner, but to some extent data mining is a hard problem, and it really depends on what you are looking for: how many files, how many results per file, how many such queries you will make, how well organized or dirty the files are, etc. It may be easiest to write a program with a good XML library.
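As a rough idea of what such a program could look like, here is a minimal sketch using Python’s built-in xml.etree.ElementTree. It assumes that “industry” and “country” are attributes on the same element; if they are child tags instead, the check would need to change. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

def text_matches(xml_text, wanted):
    """True if some element carries every attribute/value pair in `wanted`."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # treat XML that is not well-formed as a non-match
    return any(
        all(elem.get(name) == value for name, value in wanted.items())
        for elem in root.iter()
    )

def find_files(folder, wanted):
    """Return the .xml files under `folder` that contain a matching element."""
    hits = []
    for path in sorted(Path(folder).rglob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if text_matches(text, wanted):
            hits.append(str(path))
    return hits

# Hypothetical sample document, for illustration only.
sample = '<root><company country="US" industry="abc"/></root>'
print(text_matches(sample, {"industry": "abc", "country": "US"}))
```

Unlike the regex, this does not care about attribute order or line breaks. Calling find_files on a project folder with the same dictionary would list every file that has at least one matching element.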

This may be at the edge of what a text editor is responsible for.


#3

Thank you. I agree with a lot of what you said, and a colleague of mine has actually exported the XML files to Excel so that they can be filtered easily. I just thought that since you can “Find in Project”, it would be super helpful to be able to do a compound type of find. Right now, Find in Project will search all 1000 XML files that I have in Atom, but only for one data element at a time. That’s good, but it doesn’t allow me to look for combinations of data. I will use the Excel solution for now for this particular piece. Thanks again.