
Re: Feed me to the lions

Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:

> __/ [John Bokma] on Wednesday 07 December 2005 06:00 \__
>> Perl script that automatically creates a feed:
>> http://johnbokma.com/perl/rss-web-feed-builder.html
>> feedback (you can post on my site) and back links are welcome (as
>> always). 
> Okay, I was seriously about to set this up. I wanted a
> recently-added static pages feed (I notice that all of yours are
> static).
> A random few thoughts:
> - With large sites, the script hogs resources for a very long time,
> which becomes problematic on a shared server. 'nice'ing it might help.
> I suppose one could use this on small sites with fewer worries.
> Advanced stats packages are just about as resource-greedy.

I will look into that.

> - Although I can install the XML::RSS Perl module, I cannot install it
> on my host's machines. I can use phpshell to get shell access, but no
> root privileges.

Ah, but you can install the module locally. I will add a how-to.
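Meanwhile, the usual recipe for a local install (no root needed) is roughly this; the PREFIX path and the resulting lib directory are just examples, the exact layout depends on the Perl version:

```shell
# inside the unpacked XML-RSS source directory:
perl Makefile.PL PREFIX=~/perl   # install under your home directory
make
make test
make install

# then tell Perl where to find it, e.g. in your shell profile:
export PERL5LIB=~/perl/lib/perl5/site_perl
```

Alternatively, add a `use lib` line with that path at the top of the script itself.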

> - The script is a neat one to put on the Web server and define to be
> run nightly using the crontab.

Yup, that was one of the ideas, either that, or run it on the local 
machine. My site is created on my local machine, and only new pages are 
uploaded, which might include the index.rss.
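A nightly crontab entry for it could look like this (the path and time are made up, of course):

```shell
# run the feed builder every night at 03:15, nice'd so it
# doesn't hog a shared server
15 3 * * * nice perl /home/you/bin/rss-web-feed-builder.pl
```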

> Again, there is the issue of
> scalability. cPanel allows cron jobs to be defined without the need
> for workarounds such as shell/SSH/telnet 'deprivation', which is
> often set by default.
> - The script seems very customisable and simple enough to tailor to
> one's needs.

Yup. I am open to suggestions :-D.

> - If the script cannot be automated to run independently, it's
> almost pointless.

Not at all, see above. If one has a local copy of the website, it can be
run locally. However, this means that one shouldn't have some program
that re-generates all files on the website regardless of whether they
have new content or not.

> I noticed that you periodically modify your XML
> files, which makes updates non-real-time.

I run a mkfeed.pl script manually. I build my site as follows:

I have a bunch of XML files, and a Perl script that parses those files
and generates HTML files from them (not 1:1; one XML file can generate
several HTML files). The generation of the HTML is done in memory. For
each HTML page, the in-memory version is compared with the version from
a previous run; if there is a change, the in-memory version is written
to disk. So when the program finishes, it gives a nice overview of
which pages are new.
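The compare-then-write step boils down to something like this (just a sketch; the real script does this for in-memory HTML, and the sub name here is made up):

```perl
use strict;
use warnings;

# Sketch: write $new to $path only if it differs from what is on disk.
# Returns 1 if the file was (re)written, 0 if it was left alone.
sub write_if_changed {
    my ( $path, $new ) = @_;

    my $old = '';
    if ( open my $in, '<', $path ) {
        local $/;                    # slurp mode
        $old = <$in>;
        $old = '' unless defined $old;
    }
    return 0 if $old eq $new;        # unchanged: don't touch the file

    open my $out, '>', $path or die "can't write $path: $!";
    print $out $new;
    close $out;
    return 1;                        # new or changed: written to disk
}
```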

A second script compares each HTML file in the web directory with a copy 
of the actual website. If there is a change, the new file is uploaded 
and added to the local copy :-D. So only files that are new/changed are 
transferred.
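(For what it's worth: where the host allows it, a tool like rsync gives a similar upload-only-what-changed effect; the paths and host below are hypothetical.)

```shell
# compare files by checksum and upload only the ones that differ
rsync -avz --checksum web/ you@example.com:public_html/
```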

The mkfeed script just parses a text file, and creates a feed based on 
the info in it. What I do is add a URL to the end of the file for each 
new item. The script extracts the info much like the feed-builder 
script does, and creates the feed :-)
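In spirit it is not much more than this (a bare sketch: the real mkfeed.pl uses XML::RSS and also digs the title and description out of each page; here the URL doubles as the item title):

```perl
use strict;
use warnings;

# Sketch: turn a list of URLs (one per line in the real text file)
# into a minimal RSS 2.0 document. Channel title/link are parameters.
sub urls_to_rss {
    my ( $title, $link, @urls ) = @_;

    my @items = map {
        "<item><title>$_</title><link>$_</link></item>"
    } @urls;

    return join "\n",
        '<?xml version="1.0"?>',
        '<rss version="2.0"><channel>',
        "<title>$title</title><link>$link</link>",
        @items,
        '</channel></rss>';
}
```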

> All in all, the script would suit many people, but it does not
> accommodate my circumstances.

And the second one? (i.e. you copy URLs to a txt file)

> I guess I could run the Perl
> from here ( http://baine.smb.man.ac.uk:8001 ), but putting my
> feeds off-site would be similar to 'offshoring' feeds to FeedBurner.
> It also devours bandwidth.

Is the main reason XML::RSS, or the resource usage when the HTML files 
are scanned?

> Roy
> PS - I like scripts that are practical, such as ones that serve as SEO
> tools and automated site 'housekeeping'. Keep 'em coming!

Thanks, suggestions are very welcome for new tools :-D

John                       Perl SEO tools: http://johnbokma.com/perl/
                                             or have them custom made
                 Experienced (web) developer: http://castleamber.com/
