egghelp.org community
Discussion of eggdrop bots, shell accounts and tcl scripts.
 

links - url and img collector

 
ngtjah
Voice


Joined: 30 Mar 2014
Posts: 5

Posted: Mon Mar 31, 2014 12:02 am    Post subject: links - url and img collector

A web page and a set of scripts that collect and display links and images from an IRC channel, using an eggdrop bot's channel log.

https://github.com/ngtjah/links

It can also do a few other things, such as integrating with Twitter and Pocket.

This is a little project I've been working on for a few years, driven by my small IRC community. I thought this might be a good place to share it and get some feedback.

Check out the GitHub wiki for more information.

-ngtjah
Nocty
Voice


Joined: 17 Jun 2014
Posts: 15

Posted: Tue Jun 17, 2014 3:28 pm    Post subject: Works! Except for YouTube...

Thanks to your excellent guide I was able to follow the directions and, with a little troubleshooting and only minimal Linux knowledge (running this on Debian), get this script mostly working.

For some reason, however, it does not like the YouTube links I tested it with and treats them as error pages.

Output of Links_logs.log

Code:

This URL: https://www.youtube.com/watch?v=HMUDVMiITOU&feature=kp doesn't really exist!! Return Code: 501

$VAR1 = bless( {
                 'isdupe' => 0,
                 'date' => '2014-06-17 14:10:31',
                 'parseline' => '[14:10:31:2014-06-17] <Nocty> https://www.youtube.com/watch?v=HMUDVMiITOU&feature=kp',
                 'body' => 'https://www.youtube.com/watch?v=HMUDVMiITOU&feature=kp',
                 'mimetype_returncode' => 501,
                 'type' => 'irc',
                 'announcer' => 'Nocty',
                 'mimetype' => 'text/plain',
                 'www_url' => 'https://www.youtube.com/watch?v=HMUDVMiITOU&feature=kp'
               }, 'Links' );
ngtjah
Voice


Joined: 30 Mar 2014
Posts: 5

Posted: Mon Jul 21, 2014 10:59 pm

Hi Nocty, glad you were able to get it working! I was hoping I had not missed anything in the guide. You are the first user I have heard back from since releasing this. Thanks for the feedback.

I will take a look at this problem. I have seen similar 5xx errors on my site before. I'm thinking something must be going on with the "feature=kp" in the URL. Does it work OK without that?

If you have any more issues, please feel free to report them under Issues on the GitHub page, where I should get back to you a bit faster.

-ngtjah
ngtjah
Voice


Joined: 30 Mar 2014
Posts: 5

Posted: Wed Jul 23, 2014 3:26 pm    Post subject: Re: Works! Except for YouTube...

Hey Nocty,

I found the issue. Some of my recent code didn't make it to the repository. If you grab the new Links.pm file you should be good to go.

-ngtjah
Nocty
Voice


Joined: 17 Jun 2014
Posts: 15

Posted: Thu Jul 24, 2014 11:24 am    Post subject: Thanks!

:D

Awesome, will grab it to test now. I had been looking to replace my current link logger because it's ugly as sin, but as you might imagine, YouTube links are pretty common in the channel.

I will grab the new .pm and let you know how things work, but I'm pretty sure you're correct; I noticed that it was indeed catching some YT links but not all of them.

Thanks again for all your hard work!
Nocty
Voice


Joined: 17 Jun 2014
Posts: 15

Posted: Thu Jul 24, 2014 11:49 am    Post subject: Same error

Still getting the same error with the updated Links.pm; it does not seem to like anything after the & in the URL.

Code:
This URL: https://www.youtube.com/watch?v=HMUDVMiITOU&t=10 doesn't really exist!! Return Code: 500

$VAR1 = bless( {
                 'isdupe' => 0,
                 'date' => '2014-07-24 11:30:46',
                 'parseline' => '[11:30:46:2014-07-24] <@Nocty> https://www.youtube.com/watch?v=HMUDVMiITOU&t=10',
                 'body' => 'https://www.youtube.com/watch?v=HMUDVMiITOU&t=10',
                 'mimetype_returncode' => 500,
                 'type' => 'irc',
                 'announcer' => 'Nocty',
                 'mimetype' => 'text/plain',
                 'www_url' => 'https://www.youtube.com/watch?v=HMUDVMiITOU&t=10'
               }, 'Links' );


You could potentially strip everything from the first & onward in a YouTube URL before parsing, since these options only tell the browser to skip to a particular point in the video, enable HD, etc., and removing them wouldn't affect the video title being parsed.

It actually might be a good thing to do regardless, since this would also make it so that

https://www.youtube.com/watch?v=HMUDVMiITOU&t=10 (skipping to 10 seconds in)
and
https://www.youtube.com/watch?v=HMUDVMiITOU&t=20 (skipping to 20 seconds in)

would both be truncated as

https://www.youtube.com/watch?v=HMUDVMiITOU

and would not result in multiple database entries for the same video that differ only in their timestamps.
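
A minimal sketch of the truncation suggested above, written in Perl since Links.pm is a Perl module. The helper name normalize_youtube_url is hypothetical and not part of the links project. Instead of cutting at the first &, it keeps only the v= parameter, on the assumption that v is not always the first parameter in the query string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Links.pm): reduce a YouTube watch URL to
# a canonical form so that timestamp or feature parameters do not create
# separate database entries for the same video.
sub normalize_youtube_url {
    my ($url) = @_;

    # Only touch youtube.com/watch URLs; return everything else unchanged.
    return $url
        unless $url =~ m{^(https?://(?:www\.)?youtube\.com/watch)\?([^#]*)}i;
    my ($base, $query) = ($1, $2);

    # Look for the v= parameter wherever it appears in the query string,
    # rather than cutting at the first "&", so ?feature=kp&v=ID still works.
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        return "$base?v=$value"
            if defined $key && $key eq 'v' && defined $value && length $value;
    }

    return $url;    # no video id found: leave the URL as posted
}

print normalize_youtube_url($_), "\n" for (
    'https://www.youtube.com/watch?v=HMUDVMiITOU&t=10',
    'https://www.youtube.com/watch?v=HMUDVMiITOU&t=20',
    'https://www.youtube.com/watch?feature=kp&v=HMUDVMiITOU',
);
```

All three example URLs come out as https://www.youtube.com/watch?v=HMUDVMiITOU, so the duplicate check would see them as the same site. Short youtu.be links and other hosts are deliberately left alone in this sketch.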
ngtjah
Voice


Joined: 30 Mar 2014
Posts: 5

Posted: Thu Jul 24, 2014 8:32 pm

Interesting...
I can't seem to replicate the issue on my system, possibly because of some differences in our Perl modules. I do see that you are now receiving a 500 error, where before it was a 501. 500 is a less specific error than 501, so that doesn't really help... hmmm.

Could it be that it works with http and not https?

Would you also paste the full log for this entry, starting from "initializing links object"? Can you show me the log from a YouTube link that does work as well?

I could create an option to strip the URL parameters as you suggested, but let's make sure we know where the issue is first.

thanks!
Nocty
Voice


Joined: 17 Jun 2014
Posts: 15

Posted: Fri Jul 25, 2014 8:46 am    Post subject: You're onto something

Actually it looks like you're definitely onto something; it would appear that none of my HTTPS links are parsing correctly.

Code:
Initialize Links Object
Checking existance of site in database..
not dupe
This URL: https://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-xfp1/t1.0-9/1513695_314741348650880_1231833456_n.jpg doesn't really exist!! Return Code: 501

$VAR1 = bless( {
                 'isdupe' => 0,
                 'date' => '2014-07-18 03:01:17',
                 'parseline' => '[03:01:17:2014-07-18] <Frank> https://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-xfp1/t1.0-9/1513695_314741348650880_1231833456_n.jpg btfo',
                 'body' => 'https://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-xfp1/t1.0-9/1513695_314741348650880_1231833456_n.jpg btfo',
                 'mimetype_returncode' => 501,
                 'type' => 'irc',
                 'announcer' => 'Frank',
                 'mimetype' => 'text/plain',
                 'www_url' => 'https://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-xfp1/t1.0-9/1513695_314741348650880_1231833456_n.jpg'
               }, 'Links' );
0 entries added/updated to the database.
done.


As for my versions:

Version: 5.836-1 (libwww-perl)
Version: 1.30-1 (libmime-types-perl)
Nocty
Voice


Joined: 17 Jun 2014
Posts: 15

Posted: Fri Jul 25, 2014 11:06 am    Post subject: Perplexing - EDIT: FIXED!

Yeah, something is definitely not right on my end. I wrote a simple Perl script to compare MIME type responses, based on this:

http://stackoverflow.com/questions/523773/how-do-i-find-a-links-content-type-in-perl

and it returns different types for the same link over HTTP vs HTTPS:

Code:
user@AALurker:~/links$ perl test.pl
Trying https://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-xfp1/t1.0-9/1513695_314741348650880_1231833456_n.jpg
The type is text/plain
Trying http://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-xfp1/t1.0-9/1513695_314741348650880_1231833456_n.jpg
The type is image/jpeg


Any thoughts? I'm going to try installing LWP from something other than the Debian package manager.
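
One way to narrow this kind of mismatch down, sketched under assumptions. This is not the test.pl used above, which isn't shown in the thread. LWP::UserAgent's head and the response methods code, content_type and header are real; describe_response and compare_schemes are hypothetical names. As far as I know, LWP tags responses that it creates itself (for example when a protocol scheme is not supported) with a "Client-Warning: Internal response" header and a text/plain content type, which would be consistent with the 501 and text/plain in the logs above.

```perl
use strict;
use warnings;

# Summarise a response for the log. Works with any object that provides
# code(), content_type() and header(), as HTTP::Response does.
sub describe_response {
    my ($res) = @_;
    my $line = sprintf '%s (%s)', $res->code, $res->content_type || 'unknown';

    # Assumption: LWP marks responses it generated itself, without reaching
    # the server, with a "Client-Warning: Internal response" header.
    my $warning = $res->header('Client-Warning') // '';
    $line .= ' [generated locally by LWP, not by the server]'
        if $warning eq 'Internal response';
    return $line;
}

# Issue a HEAD request for the same resource over both schemes.
sub compare_schemes {
    my ($host_and_path) = @_;
    require LWP::UserAgent;    # loaded lazily; only needed for real requests
    my $ua = LWP::UserAgent->new(timeout => 10);
    for my $scheme (qw(http https)) {
        my $url = "$scheme://$host_and_path";
        print "$url => ", describe_response($ua->head($url)), "\n";
    }
}

# Usage: perl compare.pl 'www.youtube.com/watch?v=hCRDskZrUMU'
compare_schemes($_) for @ARGV;
```

If the https line is flagged as generated locally, the request never left the machine, which points at the local Perl installation rather than at the remote site.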

EDIT: Fixed it! :P

I think the version of LWP installed via Debian's software center was either too old to support HTTPS or didn't include

http://search.cpan.org/~mschilli/LWP-Protocol-https-6.06/

As a caveat, there was only a brief span of time when the HTTPS module was included by default: my version was too old, and anything from 6.02 onward no longer bundles it, because

Quote:
This module used to be bundled with the libwww-perl, but it was unbundled in v6.02 in order to be able to declare its dependencies properly for the CPAN tool-chain. Applications that need https support can just declare their dependency on LWP::Protocol::https and will no longer need to know what underlying modules to install.


I was able to resolve this issue by re-installing via the CPAN shell. Using sudo or at a root console:

Code:
root@AALurker: perl -MCPAN -eshell           # may need to initialize; accept the default answer to every question
cpan> install Bundle::LWP                    # again, answer with the default or "yes"
cpan> install LWP::Protocol::https           # again, answer with the default or "yes"
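
To confirm that the install took effect, a small check along these lines can help. It is only a sketch: module_available is a hypothetical helper name, and it tests that the modules load in the perl you run it with, not that an HTTPS request actually succeeds.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this perl, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # LWP::Protocol::https -> LWP/Protocol/https.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(LWP::UserAgent LWP::Protocol::https)) {
    printf "%-22s %s\n", $module,
        module_available($module) ? 'installed' : 'MISSING';
}
```

Run it with the same perl binary and user that the links scripts use, since a module installed through the CPAN shell for one perl is not necessarily visible to another.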


Once I did the above, links_logs.log output was:

Code:
Initialize Links Object
Checking existance of site in database..
not dupe
Remote Server Mime Type: text/html
Title: Blaze Loves His Kennel (ORIGINAL) Husky Says No to Kennel - Funny - YouTube
Entering site...
MYSQL:INSERT INTO links (site, announcer, edate, type, title, filename, twidth, theight, width, height, appid) VALUES ('https://www.youtube.com/watch?v=hCRDskZrUMU', 'Nocty', '2014-07-25 10:39:46', 'irc', 'Blaze Loves His Kennel (ORIGINAL) Husky Says No to Kennel - Funny - YouTube', NULL, NULL, NULL, NULL, NULL, NULL)
Announcer : Nocty   URL : https://www.youtube.com/watch?v=hCRDskZrUMU

$VAR1 = bless( {
                 'isdupe' => 0,
                 'date' => '2014-07-25 10:39:46',
                 'parseline' => '[10:39:46:2014-07-25] <@Nocty> https://www.youtube.com/watch?v=hCRDskZrUMU',
                 'body' => 'https://www.youtube.com/watch?v=hCRDskZrUMU',
                 'www_img' => 'https://www.youtube.com/watch?v=hCRDskZrUMU',
                 'mimetype_returncode' => '200',
                 'title' => 'Blaze Loves His Kennel (ORIGINAL) Husky Says No to Kennel - Funny - YouTube',
                 'type' => 'irc',
                 'announcer' => 'Nocty',
                 'mimetype' => 'text/html',
                 'www_url' => 'https://www.youtube.com/watch?v=hCRDskZrUMU'
               }, 'Links' );
1 entry added/updated to the database.
done.
ngtjah
Voice


Joined: 30 Mar 2014
Posts: 5

Posted: Fri Jul 25, 2014 11:46 am

NICE! Enjoy!
Nocty
Voice


Joined: 17 Jun 2014
Posts: 15

Posted: Fri Jul 25, 2014 11:59 am    Post subject: Thanks!

Thanks a ton for your hard work, great script! I posted the same fix as a comment on your GitHub page in case anyone else is a Linux scrub like me and runs into the same issue.
All times are GMT - 4 Hours