I need help with a script that grabs certain URLs from a text file.
URLs on any of three host variations of a domain and of its mask domain
(www.domain.com, domain.com, m.domain.com and www.domainmask.com, domainmask.com, m.domainmask.com)
should be grabbed, for example:
http://www.domain.com/profile/12345678
All of these domain variations (www.domain.com, domain.com, m.domain.com and www.domainmask.com, domainmask.com, m.domainmask.com) should be
renamed to www.domain.com.
The URL should be grabbed out of a bunch of surrounding text, such as blah.com=http://www.domain.com/profile/12345678?anything=like12454
After the domain variations are renamed and every URL string is grabbed, any duplicates must be removed.
The following is the text to grab from:
1. 2345 ewrtwert 345345 blah.com=http://www.domain.com/profile/12345678?anything=like12454 | sdass 12456 May
2. e2345 ewrtwert 345345 http://google.com?http://m.domain.com/profile/34345678&anything=like12454 | sdass 12456 May
3. wer345 ewrtwert 345345 http://google.com&http://m.domainmask.com/profile/45345678/anything=like12454 | sdass 12456 May
4. 2we345 ewrtwert 345345 http://google.com%http://www.domainmask.com/profile/qwe345678%anything=like12454 | sdass 12456 May
The output, after grabbing the URLs, renaming the domain and domain mask, and removing duplicates, should be:
http://www.domain.com/profile/12345678
http://www.domain.com/profile/34345678
http://www.domain.com/profile/45345678
http://www.domain.com/profile/qwe345678
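
A rough sketch of what I have in mind, in Python, is below. It rests on assumptions taken only from the samples above: the part to keep is always /profile/ followed by a run of letters and digits, anything after that run (?, &, /, %) is discarded, and the output always uses http://www.domain.com. The names PROFILE_URL and extract_profile_urls are made up for illustration.

```python
import re

# Assumption: the wanted URL is always /profile/<letters and digits> on one of
# the six host variations; everything after the ID is dropped.
PROFILE_URL = re.compile(
    r"https?://(?:www\.|m\.)?domain(?:mask)?\.com/profile/([A-Za-z0-9]+)",
    re.IGNORECASE,
)


def extract_profile_urls(lines):
    """Return normalized profile URLs in first-seen order, without duplicates."""
    seen = set()
    result = []
    for line in lines:
        for match in PROFILE_URL.finditer(line):
            # Rename every host variation to www.domain.com.
            url = "http://www.domain.com/profile/" + match.group(1)
            if url not in seen:
                seen.add(url)
                result.append(url)
    return result


sample = [
    "1. 2345 ewrtwert 345345 blah.com=http://www.domain.com/profile/12345678?anything=like12454 | sdass 12456 May",
    "2. e2345 ewrtwert 345345 http://google.com?http://m.domain.com/profile/34345678&anything=like12454 | sdass 12456 May",
    "3. wer345 ewrtwert 345345 http://google.com&http://m.domainmask.com/profile/45345678/anything=like12454 | sdass 12456 May",
    "4. 2we345 ewrtwert 345345 http://google.com%http://www.domainmask.com/profile/qwe345678%anything=like12454 | sdass 12456 May",
]

for url in extract_profile_urls(sample):
    print(url)
```

To run it against a real file, the open file object can be passed in place of the list, since the function only iterates over lines.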