Module Rss

module Rss: sig .. end

A library for reading and writing RSS 2.0 files.

Reference: RSS 2.0 specification.


Types

type date = Netdate.t 
val string_of_date : ?fmt:string -> date -> string

Format a date/time record as a string according to the format string fmt.

fmt : The format string. It consists of zero or more conversion specifications and ordinary characters. All ordinary characters are kept as such in the final string. A conversion specification consists of the '%' character and one other character. See Netdate.format_to for more details. Default: "%d %b %Y".
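
As an illustration, here is a minimal sketch of formatting a date with and without a custom format. It assumes the program is linked against this library and Ocamlnet (which provides Netdate); the timestamp 0.0 is only an illustrative value.

```ocaml
(* Assumption: compiled against the rss library and Ocamlnet (Netdate). *)
let () =
  (* Netdate.create converts seconds since the Unix epoch into a date record. *)
  let d : Rss.date = Netdate.create 0.0 in
  (* Uses the default format, "%d %b %Y". *)
  print_endline (Rss.string_of_date d);
  (* Uses a custom year-month-day format. *)
  print_endline (Rss.string_of_date ~fmt:"%Y-%m-%d" d)
```
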
type email = string 

An email address, optionally followed by a name, for example: foo@bar.com (Mr Foo Bar)

type pics_rating = string 
type skip_hours = int list 

Hours of the day, from 0 to 23.

type skip_days = int list 

Days of the week: 0 is Sunday, 1 is Monday, ...
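
Since both types are plain integer lists, nothing prevents out-of-range values. The sketch below shows one way to check them before building a channel. The helper names valid_skip_hours and valid_skip_days are hypothetical and not part of the library, the types are redeclared locally so the code runs on its own, and the upper bound of 6 (Saturday) for days is an assumption extrapolated from the numbering above.

```ocaml
(* Local aliases mirroring the library's definitions, so that this
   sketch runs without the library. *)
type skip_hours = int list
type skip_days = int list

(* Hypothetical helper: every hour must lie in 0 .. 23. *)
let valid_skip_hours (hs : skip_hours) : bool =
  List.for_all (fun h -> h >= 0 && h <= 23) hs

(* Hypothetical helper. Assumption: days run from 0 (Sunday) to 6 (Saturday). *)
let valid_skip_days (ds : skip_days) : bool =
  List.for_all (fun d -> d >= 0 && d <= 6) ds

let () =
  (* Skip the night hours and the weekend. *)
  let night = [0; 1; 2; 3; 4; 5] in
  let weekend = [0; 6] in
  assert (valid_skip_hours night);
  assert (valid_skip_days weekend);
  (* 24 is not a valid hour. *)
  assert (not (valid_skip_hours [24]))
```
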

type url = Neturl.url 
type category = {
   cat_name : string; (*

A forward-slash-separated string that identifies a hierarchic location in the indicated taxonomy.

*)
   cat_domain : url option; (*

Identifies a categorization taxonomy.

*)
}
type image = {
   image_url : url; (*

The URL of a GIF, JPEG or PNG image that represents the channel.

*)
   image_title : string; (*

Description of the image. It is used in the ALT attribute of the HTML <img> tag when the channel is rendered in HTML.

*)
   image_link : url; (*

The URL of the site. When the channel is rendered, the image is a link to the site. (Note: in practice, image_title and image_link should have the same values as the ch_title and ch_link of the Rss.channel.)

*)
   image_height : int option; (*

Height of the image, in pixels.

*)
   image_width : int option; (*

Width of the image, in pixels.

*)
   image_desc : string option; (*

Text to be included in the "title" attribute of the link formed around the image in the HTML rendering.

*)
}
type text_input = {
   ti_title : string; (*

The label of the Submit button in the text input area.

*)
   ti_desc : string; (*

Explains the text input area.

*)
   ti_name : string; (*

The name of the text object in the text input area.

*)
   ti_link : url; (*

The URL of the CGI script that processes text input requests.

*)
}
type enclosure = {
   encl_url : url; (*

URL of the enclosure.

*)
   encl_length : int; (*

Size of the enclosure, in bytes.

*)
   encl_type : string; (*

MIME type of the enclosure.

*)
}
type guid = 
| Guid_permalink of url (*

A permanent URL pointing to the story.

*)
| Guid_name of string (*

A string that uniquely identifies the item.

*)
type source = {
   src_name : string;
   src_url : url;
}
type cloud = {
   cloud_domain : string;
   cloud_port : int;
   cloud_path : string;
   cloud_register_procedure : string;
   cloud_protocol : string;
}

See the RSS 2.0 specification for the meaning of these fields.

type 'a item_t = {
   item_title : string option; (*

Optional title

*)
   item_link : url option; (*

Optional link

*)
   item_desc : string option; (*

Optional description

*)
   item_pubdate : date option; (*

Date of publication

*)
   item_author : email option; (*

The email address of the author of the item.

*)
   item_categories : category list; (*

Categories for the item. See the type Rss.category.

*)
   item_comments : url option; (*

URL of a page of comments about this item.

*)
   item_enclosure : enclosure option;
   item_guid : guid option; (*

A globally unique identifier for the item.

*)
   item_source : source option;
   item_data : 'a option; (*

Additional data, since RSS can be extended with namespace-prefixed nodes.

*)
}

An item may represent a "story". Its description is a synopsis of the story (or sometimes the full story), and the link points to the full story.

type namespace = string * string 

A namespace is a pair (name, url).

type ('a, 'b) channel_t = {
   ch_title : string; (*

Mandatory. The name of the channel, for example the title of your web site.

*)
   ch_link : url; (*

Mandatory. The URL to the HTML website corresponding to the channel.

*)
   ch_desc : string; (*

Mandatory. A sentence describing the channel.

*)
   ch_language : string option; (*

Language of the news, e.g. "en". See the W3C language codes.

*)
   ch_copyright : string option; (*

Copyright notice.

*)
   ch_managing_editor : email option; (*

Managing editor of the news.

*)
   ch_webmaster : email option; (*

The address of the webmaster of the site.

*)
   ch_pubdate : date option; (*

Publication date of the channel.

*)
   ch_last_build_date : date option; (*

The last time the channel content changed.

*)
   ch_categories : category list; (*

Categories for the channel. See the type Rss.category.

*)
   ch_generator : string option; (*

The tool used to generate this channel.

*)
   ch_cloud : cloud option; (*

Allows processes to register with a cloud to be notified of updates to the channel.

*)
   ch_docs : url option; (*

A URL pointing to a reference for the RSS format.

*)
   ch_ttl : int option; (*

Time to live, in minutes. It indicates how long a channel can be cached before refreshing from the source.

*)
   ch_image : image option;
   ch_rating : pics_rating option; (*

The PICS rating for the channel.

*)
   ch_text_input : text_input option;
   ch_skip_hours : skip_hours option; (*

A hint for aggregators telling them which hours they can skip.

*)
   ch_skip_days : skip_days option; (*

A hint for aggregators telling them which days they can skip.

*)
   ch_items : 'b item_t list;
   ch_data : 'a option; (*

Additional data, since RSS can be extended with namespace-prefixed nodes.

*)
   ch_namespaces : namespace list;
}
type item = unit item_t 
type channel = (unit, unit) channel_t 

Building items and channels

val item : ?title:string ->
?link:url ->
?desc:string ->
?pubdate:date ->
?author:email ->
?cats:category list ->
?comments:url ->
?encl:enclosure ->
?guid:guid -> ?source:source -> ?data:'a -> unit -> 'a item_t

item () creates a new item with all optional fields set to None and an empty category list. Use the optional parameters to set fields.
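
A minimal sketch of building an item with a few of the optional parameters. It assumes linking against this library and Ocamlnet (Neturl); the URL, title, and identifier string are made-up example values.

```ocaml
(* Assumption: compiled against the rss library and Ocamlnet (Neturl). *)
let story : Rss.item =
  Rss.item
    ~title:"Version 1.0 released"
    ~link:(Neturl.parse_url "http://example.com/news/1")
    ~desc:"A short synopsis of the story."
    ~guid:(Rss.Guid_name "example.com-news-1")
    ()  (* the final unit argument marks the end of the optional parameters *)

let () =
  (* Fields that were not given, such as item_pubdate, stay at None. *)
  match story.Rss.item_title with
  | Some t -> print_endline t
  | None -> print_endline "(untitled)"
```

The type annotation Rss.item fixes the type parameter to unit, which is appropriate when no additional data is attached through ?data.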

val channel : title:string ->
link:url ->
desc:string ->
?language:string ->
?copyright:string ->
?managing_editor:email ->
?webmaster:email ->
?pubdate:date ->
?last_build_date:date ->
?cats:category list ->
?generator:string ->
?cloud:cloud ->
?docs:url ->
?ttl:int ->
?image:image ->
?rating:pics_rating ->
?text_input:text_input ->
?skip_hours:skip_hours ->
?skip_days:skip_days ->
?data:'a ->
?namespaces:namespace list ->
'b item_t list -> ('a, 'b) channel_t

channel ~title ~link ~desc items creates a new channel containing items. The other fields are set to None unless the corresponding optional parameter is used.
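
A minimal sketch of building a channel from a list of items. It assumes linking against this library and Ocamlnet (Neturl); the site URL and all texts are made-up example values.

```ocaml
(* Assumption: compiled against the rss library and Ocamlnet (Neturl). *)
let feed : Rss.channel =
  let site = Neturl.parse_url "http://example.com/" in
  let items =
    [ Rss.item ~title:"First post" (); Rss.item ~title:"Second post" () ]
  in
  (* title, link and desc are mandatory; language and ttl are optional. *)
  Rss.channel ~title:"Example news" ~link:site ~desc:"News from example.com."
    ~language:"en" ~ttl:60 items

let () =
  Printf.printf "%s (%d items)\n" feed.Rss.ch_title
    (List.length feed.Rss.ch_items)
```
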

val compare_item : ?comp_data:('a -> 'a -> int) -> 'a item_t -> 'a item_t -> int
val copy_item : 'a item_t -> 'a item_t
val copy_channel : ('a, 'b) channel_t -> ('a, 'b) channel_t

Manipulating channels

val keep_n_items : int -> ('a, 'b) channel_t -> ('a, 'b) channel_t

keep_n_items n ch returns a copy of the channel, keeping at most n items.

val sort_items_by_date : 'a item_t list -> 'a item_t list

Sort items by date, most recent first (older items last).

val merge_channels : ('a, 'b) channel_t -> ('a, 'b) channel_t -> ('a, 'b) channel_t

merge_channels c1 c2 merges the given channels into a new channel, sorting items using Rss.sort_items_by_date. Channel information is copied from the first channel c1.
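
The following sketch combines merge_channels and keep_n_items to build a channel that holds at most one item taken from two channels. It assumes linking against this library and Ocamlnet (Neturl, Netdate); the helper dated and all values are hypothetical.

```ocaml
(* Assumption: compiled against the rss library and Ocamlnet (Neturl, Netdate). *)
let site = Neturl.parse_url "http://example.com/"

(* Hypothetical helper: an item with a title and a publication date
   given in seconds since the Unix epoch. *)
let dated title secs : Rss.item =
  Rss.item ~title ~pubdate:(Netdate.create secs) ()

let c1 : Rss.channel =
  Rss.channel ~title:"First" ~link:site ~desc:"First channel"
    [ dated "old" 0.0 ]

let c2 : Rss.channel =
  Rss.channel ~title:"Second" ~link:site ~desc:"Second channel"
    [ dated "new" 86400.0 ]

(* Items of both channels, sorted by date; channel information comes from c1. *)
let merged = Rss.merge_channels c1 c2

(* A copy of the merged channel holding at most one item. *)
let latest = Rss.keep_n_items 1 merged

let () =
  List.iter
    (fun it ->
      print_endline (Option.value it.Rss.item_title ~default:"(untitled)"))
    latest.Rss.ch_items
```
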

Reading channels

type xmltree = 
| E of Xmlm.tag * xmltree list
| D of string

This represents XML trees. Such XML trees are given to functions provided to read additional data from RSS channels and items.

val xml_of_source : Xmlm.source -> xmltree

Read an XML tree from a source.

exception Error of string

Use this exception to indicate an error in functions given to make_opts, which are used to read additional data from prefixed XML nodes.

type ('a, 'b) opts 

Options used when reading a source.

val make_opts : ?schemes:(string, Neturl.url_syntax) Stdlib.Hashtbl.t ->
?base_syntax:Neturl.url_syntax ->
?read_channel_data:(xmltree list -> 'a option) ->
?read_item_data:(xmltree list -> 'b option) -> unit -> ('a, 'b) opts

See Neturl documentation for schemes and base_syntax options. They are used to parse URLs.

read_channel_data : a function to read additional information from the subnodes of the channel. All these subnodes are prefixed by an expanded namespace.
read_item_data : the equivalent of read_channel_data for items; it is called on each item with that item's prefixed subnodes.
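
As a sketch of how read_item_data might be used, the code below looks for a Dublin Core creator element among the prefixed subnodes of each item. The function read_creator is hypothetical, and the code assumes that the namespace URI appears as the first component of the Xmlm tag name, which is how Xmlm represents expanded names.

```ocaml
(* Assumption: compiled against the rss library (which uses Xmlm). *)

(* Namespace URI of the Dublin Core elements. *)
let dc_ns = "http://purl.org/dc/elements/1.1/"

(* Hypothetical reader: returns the text of the first creator element
   in the Dublin Core namespace, if there is one. *)
let read_creator (nodes : Rss.xmltree list) : string option =
  List.find_map
    (function
      | Rss.E (((uri, "creator"), _attrs), [ Rss.D text ]) when uri = dc_ns ->
          Some text
      | _ -> None)
    nodes

(* No channel data is read (unit); each item carries an optional string. *)
let opts : (unit, string) Rss.opts =
  Rss.make_opts ~read_item_data:read_creator ()

let () =
  (* Exercise the reader on a hand-built tree. *)
  let tree = Rss.E (((dc_ns, "creator"), []), [ Rss.D "Alice" ]) in
  match read_creator [ tree ] with
  | Some name -> print_endline name
  | None -> print_endline "(no creator)"
```

The resulting opts value can then be passed to one of the channel_t_of_X functions, which would return a (unit, string) channel_t whose items hold the creator in item_data.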
val default_opts : (unit, unit) opts
val channel_t_of_file : ('a, 'b) opts -> string -> ('a, 'b) channel_t * string list

channel_[t_]of_X returns the parsed channel and a list of encountered errors. Note that only namespaces declared in the root node of the XML tree are added to the ch_namespaces field.

val channel_t_of_string : ('a, 'b) opts -> string -> ('a, 'b) channel_t * string list
val channel_t_of_channel : ('a, 'b) opts ->
Stdlib.in_channel -> ('a, 'b) channel_t * string list
val channel_t_of_xmls : ('a, 'b) opts -> xmltree list -> ('a, 'b) channel_t * string list

Read a channel from XML trees. These trees correspond to nodes under the "channel" XML node of a regular RSS document.

val channel_of_file : string -> channel * string list
val channel_of_string : string -> channel * string list
val channel_of_channel : Stdlib.in_channel -> channel * string list
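
A minimal sketch of reading a channel from a string with the default options. It assumes linking against this library; the feed content is a made-up example.

```ocaml
(* Assumption: compiled against the rss library. *)

(* A small RSS 2.0 document used as example input. *)
let xml = {|<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel>
<title>Example</title>
<link>http://example.com/</link>
<description>Demo feed</description>
<item><title>Hello</title></item>
</channel></rss>|}

let () =
  (* The result is the parsed channel together with a list of error messages. *)
  let ch, errors = Rss.channel_of_string xml in
  List.iter prerr_endline errors;
  Printf.printf "%s: %d item(s)\n" ch.Rss.ch_title
    (List.length ch.Rss.ch_items)
```
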

Writing channels

type 'a data_printer = 'a -> xmltree list 
val print_channel : ?channel_data_printer:'a data_printer ->
?item_data_printer:'b data_printer ->
?indent:int ->
?date_fmt:string ->
?encoding:string -> Stdlib.Format.formatter -> ('a, 'b) channel_t -> unit
val print_file : ?channel_data_printer:'a data_printer ->
?item_data_printer:'b data_printer ->
?indent:int ->
?date_fmt:string ->
?encoding:string -> string -> ('a, 'b) channel_t -> unit
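
A minimal sketch of writing a channel, first to standard output and then to a file. It assumes linking against this library and Ocamlnet (Neturl); the channel contents are made-up example values, and the output file is a temporary file created for the demonstration.

```ocaml
(* Assumption: compiled against the rss library and Ocamlnet (Neturl). *)
let feed : Rss.channel =
  Rss.channel ~title:"Example"
    ~link:(Neturl.parse_url "http://example.com/")
    ~desc:"Demo feed"
    [ Rss.item ~title:"Hello" () ]

let () =
  (* Print the channel as XML on standard output, indented by 2 spaces. *)
  Rss.print_channel ~indent:2 Format.std_formatter feed;
  Format.pp_print_flush Format.std_formatter ();
  (* Write the same channel to a temporary file. *)
  let path = Filename.temp_file "feed" ".xml" in
  Rss.print_file ~indent:2 path feed;
  print_endline ("\nWritten to " ^ path)
```

When the channel carries additional data, the optional channel_data_printer and item_data_printer arguments are the place to convert that data back into XML trees.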