Administrator Guide | StockManiac Manual | User Guide |
The sections below describe how certain things are (or are going to be) implemented in StockManiac and which standards are used. It is the "higher vision" above the code-level details.
In general StockManiac is a three-tiered application consisting of a GUI, a daemon and a database.
The database schema is maintained with MySQL Workbench and stored in
gui/tutorials/stockmaniac-base/schema/stockmaniac-schema.mwb
The structural schema
gui/install.d/mysql-structure.sql
can be (almost) created by applying the workbench schema. Note that it still needs some records in order to run StockManiac on the structure:
This chapter explains how StockManiac deals with document uploads.
documents are stored inside the database as a Binary Large OBject (BLOB)
it is supposed to be robust
document storage is spread over multiple tables to increase performance
a document *always* belongs to a user
database column size is hardcoded to prevent the DBMS from silently truncating the file content. Look in the code for details.
the upload code is quite inefficient with regard to memory consumption. For example, during development I tried to upload a 20 MByte file. To achieve that it was necessary to set:
(php.ini) memory_limit = 100M
(my.cnf) max_allowed_packet = 45000000
(php.ini) post_max_size = 25M
(php.ini) upload_max_filesize = 25M
The first two are rather insane, so this area is subject to improvement. Here are a few items to check:
base64 encoding might make it more robust?
Apache (meaning any HTTP server in this context) can serve files from the filesystem much faster than MySQL (meaning any RDBMS in this context) can read BLOBs out of some table and serve that to Apache, which will then serve it to the browser...
Despite the performance cost there are quite a few advantages in having the documents inside the RDBMS. They range from easier backup to less logic in the application (no need to write a 'storage layer' to distribute the files across a directory structure to avoid too many files per directory, no need to find a way to ensure consistency between file storage and database...).
For these (and some other) reasons I decided to store all documents inside the RDBMS. If performance becomes an issue at some point, the plan is to implement a cache feature on top of the database storage.
The cache is actually an area on the file system containing all documents as regular files. This way Apache can serve files directly and thus downloads become much faster.
Data consistency won't be an issue because the cache is read-only in the sense that new documents are never written directly to the cache. They go into the RDBMS first and get stamped with a unique ID there, which will be the basis for building the cache.
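The cache layout itself is not specified here. As a minimal sketch, assuming a hypothetical helper cache_path_for() and a made-up cache root, the unique document ID could be mapped to a file path like this. The two-digit bucket directory is one possible way to avoid too many files per directory:

```php
<?php
// Hypothetical helper, not part of StockManiac: maps a document ID to a
// path below the cache root. IDs are spread over 100 bucket directories.
function cache_path_for(string $cacheRoot, int $documentId): string
{
    $bucket = sprintf('%02d', $documentId % 100);
    return rtrim($cacheRoot, '/') . '/' . $bucket . '/' . $documentId;
}

// The cache root below is an assumption for illustration only.
echo cache_path_for('/var/cache/stockmaniac', 1234), "\n";
```

Since the path is derived from the database ID alone, the cache can be rebuilt from the RDBMS at any time.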
further requirements:
RSS feeds are integrated as a two-tier design. This came up because I want news lookups to happen automatically in the background. Meaning:
Tier 1: MagpieRSS library (PHP)
Tier 2: XML::RSS library (Perl)
For that reason StockManiac will strip down news items to the few pieces of information that are considered essential for doing what StockManiac does. Look at the NewsItem table design for details.
... really is a problem with RSS, since there is no unique identifier for news items (versions 0.9x and 1.0 do not specify one; 2.0 does, but it is not required). For that reason StockManiac has to create a (more or less) unique string in order to identify a news item within a feed.
The NewsItem table provides a general 'Identifier' field for this purpose. An identifier is a string that either comes from the actual news source (e.g. an RSS guid or the atom:id field, ...) or is generated somehow. The essential requirement is that the string is unique enough to identify a news item within its feed. Its purpose is to tell whether a news item has already been fetched.
The Identifier is *not* used for anything other than that. News assignments, read/unread marks, etc... are implemented by utilizing the internal database IDs which are usually AUTO_INCREMENT values and therefore much more reliable.
Identifiers are supposed to be extensible in the future. For example, we could use different hashing algorithms and/or store some entity from an external source in there. The 'IdentifierMethod' field is meant to indicate what exactly an identifier is. This way code can know how to deal with the identifier string.
Currently there is only 'md5' defined as identification method. Extensions should be added to the enum list.
StockManiac *should* support everything that MagpieRSS and XML::RSS can handle. However, while implementing I noticed that 7 out of 10 feeds do not conform to any of the existing RSS versions.
Sometimes fields are missing, on other occasions fields are not formatted correctly. For example, pubDate should be an RFC 822 formatted string. I have seen all kinds of date strings: date with/without time, as 12h, as 24h, with/without timezone, etc... some feeds do not even mention a version.
So the current implementation is my best guess based on the feeds I tested with. There is a lot of room for improvement.
StockManiac calculates an MD5 hash for each news item in order to figure out whether the item has been fetched already (see above). It uses the following fields:
link + pubDate + strlen(title)
The fields are concatenated in this order and then pushed through the MD5 algorithm. Note that:
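The hash construction described above can be sketched in a few lines. The function name news_item_identifier() is hypothetical; only the field order (link, pubDate, length of the title) and the use of MD5 are taken from this guide:

```php
<?php
// Hypothetical helper; the name of the real StockManiac function is not
// given in this guide.
function news_item_identifier(string $link, string $pubDate, string $title): string
{
    // Concatenate link, pubDate and the title length (in that order),
    // then push the result through MD5.
    return md5($link . $pubDate . strlen($title));
}

// Made-up feed item for illustration.
$id = news_item_identifier(
    'http://example.com/news/1',
    'Sun, 22 Aug 2010 11:20:10 +0200',
    'Some headline'
);
echo $id, "\n";   // a 32 character hex string
```

Because only the length of the title enters the hash, correcting a typo in a headline without changing its length yields the same identifier.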
Atom is currently not supported. It will be. I just haven't done it yet because all the feeds I am interested in are RSS formatted. It seems that Atom is technically much nicer but is not used widely yet (?). Like many good things.
However, I'll probably implement it at some point. You can increase the priority by requesting it or by submitting a patch - this will be even faster. :-)
The atom:id element should be stored in the Hash column, as it is by definition unique [enough to identify a news item]. I would suggest adding a new IdentifierMethod. Maybe 'atom:id' or just 'atom'.
The StockManiac plugin API is pretty simple. All plugins are stored in a directory named 'plugins/'. Each plugin is actually a subdirectory below plugins/.
For fast detection each plugin should implement a file called 'init.php'. This file should do nothing but contain some meta information describing the plugin. It should look like this:
Note that the description string above has been put in a gettext() call. This is essential for localization.
In addition to that each plugin may have its own settings. Please read README.settings for details.
Each plugin must declare one of several possible types in its init.php file. Here is a list of currently implemented types and their meaning.
plugins of this type will be 'merged' into StockManiac's Main Page. They can be an Editor, a Day Selection Bar, some Display or anything else that is useful in this position
It is not necessary to create a new stockmaniac object within an integrated plugin since the one created in index.php is available due to the include mechanism. To access it use:
do NOT implement HTML Headers and Footers!
Integrated plugins are not added automatically, since StockManiac can not know where you want to see them. So you must call the plugin's run.php file from within index.php and add its template to index.tpl at your favorite position
The SelectDay plugin (FIXME: link) is a good example for integrated plugins
Plugins of the type 'editor' are integrated into StockManiac's Main Page in exactly the same way as 'integrated' (FIXME: link) plugins. Thus you can access the global StockManiac Object as well.
Editors are automatically added to the Editor Selection Bar (below each editor).
InfoDisplay (FIXME: link) is a good example for editor plugins
Plugins of this type are external pages that can not be merged into StockManiac's Main Page. Therefore they have to implement a complete HTML page with Header and Footer (of course they may use the provided header.tpl and footer.tpl templates)
External plugins can not access index.php's StockManiac object. So they _have_ to create a new object.
External plugins are automatically added to the Feature Selection Bar (on top of the Main Page). See ObjectManager (FIXME: link) as example for external plugins
External plugins have to implement an ACL check right at the beginning, that is, as soon as the StockManiac object is available. The class can not do this right now because of the way things are designed/implemented.
ACL check sample code
Plugins of this type are detected by StockManiac just like any other plugins.
Hidden plugins are *never* automatically loaded and/or included anywhere. You can call them inside your code if you *know* they are there
Hidden plugins should never be offered to the user and might even be 'headless', that is they may do something without producing any visible output on the GUI.
Hidden plugins may or may not create their own object(s). Take the Migration plugin (FIXME: link) as example for hidden plugins
To avoid interference between plugins it is important that each plugin sets unique variable names for its template(s). Interference may occur especially with integrated or editor plugins since they are all merged together in index.tpl. So please follow this simple convention while writing plugins.
A variable that is assigned to a template should be named:
<plugin name>_<some meaningful variable name>
For example, you have a plugin named 'SelectDay' and you would like to have a variable 'week' available in your template. Then you should assign it that way:
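A minimal sketch of the naming convention follows. The helper template_var_name() is hypothetical, and the way a plugin obtains the template engine (shown as $smarty in the comment) is an assumption, not something this guide specifies:

```php
<?php
// Hypothetical helper: builds a template variable name following the
// <plugin name>_<variable name> convention.
function template_var_name(string $plugin, string $variable): string
{
    return $plugin . '_' . $variable;
}

$name = template_var_name('SelectDay', 'week');   // 'SelectDay_week'

// With a Smarty instance at hand (assumed here to be $smarty), the
// assignment would then read:
//   $smarty->assign($name, $week);
```

In the template the value would then be available as {$SelectDay_week}, which cannot clash with a 'week' variable of any other plugin.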
There are basically two ways to execute a query:
(1) call a pre-defined query function via a stockmaniac or dbc object. Please check dbi.class.php for available queries. If you can not find what you need proceed with number (2) below.
(2) run a custom query via the dbc::sql() method (which is inherited by the stockmaniac class)
Custom queries should be avoided if possible to keep the total number of SQL queries in the application low (as of this writing there are already 200+ queries). Please also try to avoid queries that write data. Stockmaniac-base has a VO/DAO object layer (FIXME: link) which provides a store() method for each object; use that to write records.
Store custom queries in a separate library in your plugin directory. They should *not* be in the logic part of the code. Especially if the query is getting longer and/or has many variables.
The library should be called 'sql.inc.php' and loaded via require_once() on top of your plugin code.
Each query itself should be defined as a function which may or may not take additional parameters. The function name should start with 'sql_' followed by the nature of the query. That is either 'select', 'insert', or 'update'. At the end should be a hint on what the query is actually doing.
For example:
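A hypothetical sql.inc.php entry following this naming scheme might look like the sketch below. The table and column names are made up for illustration, and returning the query as a string is an assumption about how it would be handed to dbc::sql():

```php
<?php
// Hypothetical example for a plugin's sql.inc.php. The table and column
// names are invented; only the 'sql_<nature>_<hint>' naming comes from
// the convention described above.
function sql_select_dividends_by_user(int $userId): string
{
    // The int type hint guarantees that only a number ends up in the query.
    return 'SELECT * FROM Dividend WHERE UserId = ' . $userId;
}

echo sql_select_dividends_by_user(7), "\n";
```

Keeping the query text in one small function makes it easy to find and to reuse from several places in the plugin logic.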
No matter if custom query or predefined query, the result can always be accessed through the sql_get* methods. There is:
Custom functions should be stored in a separate function library in your plugin directory. They should be kept as general as possible (as functions always should be).
The library should be called 'lib.inc.php' and loaded via require_once() on top of your plugin code.
A good example might be the UserManager: the objective of this plugin is to provide an interface to handle accounts. Of course it would be neat to have the ability to change passwords. And it would be even nicer if StockManiac suggested randomly generated passwords. The logic of how to generate a random password is important, but it is not core logic of how to manage users. Thus it has been factored out as create_password() and is stored in the lib.inc.php file.
This feature can be used in Editor plugins where the user submitted some data, for example Dividend payments or a new Order. StockManiac will automatically process the POST data and then call the redirect target. This is usually the corresponding Display. Thus, when data was submitted in the DividendEditor, the user will be redirected to the DividendDisplay and can see the changes immediately.
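To illustrate what such a library function can look like, here is a minimal sketch of a password generator. Only the name create_password() comes from this guide; the signature, the alphabet and the use of random_int() (PHP 7 or later) are assumptions and not the actual UserManager code:

```php
<?php
// Sketch of a lib.inc.php style helper. The real create_password() in
// the UserManager plugin may work differently.
function create_password(int $length = 12): string
{
    // Look-alike characters (l, I, O, 0, 1) are left out on purpose.
    $alphabet = 'abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789';
    $max = strlen($alphabet) - 1;
    $password = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() draws from a cryptographically secure source.
        $password .= $alphabet[random_int(0, $max)];
    }
    return $password;
}

echo create_password(), "\n";
```

The function knows nothing about users or accounts, which is exactly why it belongs in lib.inc.php rather than in the plugin logic.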
To use this in your plugin code simply set the redirect target plugin as a hidden input field. Like:
<input type="hidden" name="redirect" value="DividendDisplay" />
On submit StockManiac will try to run and display this plugin right after the plugin that set the redirect target has been called.
Please use the stockmaniac::message() method for status messages you want the user to see. For example:
Note that all strings that will be shown to the user should be enclosed in a gettext() call (or _() for short; see http://php.net/gettext). This is essential for proper localization.
StockManiac will take care that everything is shown. You can call message() as often as you like. All messages will be queued and printed the next time the page reloads.
If your code discovers a non-major error or you want to tell something important to the user please use the stockmaniac::error() method to do so. For example:
This will print your message to the user. In addition it tries to print all errors that may have been collected until that point. Finally it stops StockManiac right away.
Your code, as well as anything else, won't be able to continue any further. So please be careful with this. It works like:
The purpose of the settings API is to provide a uniform way to store and read back all kinds of settings inside StockManiac. These are user specific account settings as well as plugin related preferences.
The Settings plugin is responsible for presenting available settings to the user and allowing them to modify things, limited of course to a (predefined) set of possible values and with respect to ACLs.
Each plugin may define any number of settings. This should be done in the plugin's 'init.php' file and look like:
Inside the plugin code settings can be accessed via the StockManiac object. Just like:
The get_setting method also accepts the setting type ('plugin' or 'account') as second parameter, with 'plugin' assumed as the default type. As third parameter it accepts the name of a plugin. If the third parameter is omitted, it will assume that $SM->EDITOR is the currently running plugin and use this name to locate the setting. For example:
To apply changes the user should always go through the 'Settings' plugin, as this is supposed to be the centralized place. Plugins themselves should _not_ implement anything that allows a user to change preferences.
Account Settings are defined in account_defaults.inc.php and are structured similarly to plugin specific settings. As a consequence they are stored similarly: in the stockmaniac::$_SETTINGS array, just like the plugin specific ones.
account_defaults.inc.php defines reasonable defaults for all users. They can be changed by each user individually through the Settings plugin.
StockManiac will load the defaults as well as changes (from database) during the login phase. The resulting array is then cached inside /each/ session.
Account settings may be accessed via stockmaniac::get_setting() as well.
Each settings record must be of a certain type. The type determines how the option field is interpreted by StockManiac and how the Settings plugin will present it in the menu.
There is an array with all settings maintained inside the StockManiac object. This array is supposed to be private, while the get_setting* methods expose information out of it, usually for use inside a plugin. The array is structured like this:
The entire array is constructed in three steps during the plugin initialization phase:
Please read schema.png (FIXME: link) for details on the Setting table design.
The themes implementation is based on Eric Meyer's book "CSS - The Definitive Guide". If you have not read this book I strongly recommend doing so *before* creating or modifying themes for StockManiac.
The goal is to use CSS and XHTML technologies the way they are designed and implemented in good (!) browsers. This is a slight contrast to, say, 95% of all websites which are actually abusing these techniques. Most of the time it results in bloat which is a) hard to read and b) even harder to maintain, while c) being inflexible at any time.
I believe that CSS/XHTML/XML are very well thought-through technologies which can do everything we ever want, provided we use them in the manner intended by their inventors. For instance, in theory, CSS says it's possible to provide many styles for one and the same document. That way it can be prepared for use on different media and/or devices. That is for printing on A4 or US-Letter paper, for viewing on PDAs and cellphones, for big screens and small screens, for Trekkies or even for blind people who rely on braille or speech synthesizer devices.
Even though not all of this is used in the beginning (and perhaps never will be), StockManiac's general design is made with all this in the back of the mind. It is supposed to keep all those possibilities open. Therefore please obey the following rules when creating themes:
Themes reside in the themes/ directory
separate CSS styles into multiple files
utilize alternate style sheets
All plugins are surrounded by div tags like
<div id="plugin_<pluginname>"> ... </div>
<pluginname> is taken automatically from init.php. It must be lowercase; do not forget the 'plugin_' prefix.
More complex plugins may (in addition to id="plugin_<pluginname>") set a class to indicate which template is currently shown. The classname must be lowercase and be prefixed with 'template_'.
For instance the complete div tag surrounding the Asset plugin while the TimeTags template is in use would look like this:
<div id="plugin_asset" class="template_timetags"> ... </div>
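The id and class rules above can be expressed as a small helper. This is only a sketch; plugin_div_open() is a hypothetical name, and StockManiac itself may generate the wrapper in its templates rather than in PHP code:

```php
<?php
// Hypothetical helper: builds the opening div tag for a plugin, with the
// lowercase 'plugin_' id and the optional lowercase 'template_' class.
function plugin_div_open(string $pluginName, ?string $templateName = null): string
{
    $html = '<div id="plugin_' . htmlspecialchars(strtolower($pluginName)) . '"';
    if ($templateName !== null) {
        $html .= ' class="template_' . htmlspecialchars(strtolower($templateName)) . '"';
    }
    return $html . '>';
}

echo plugin_div_open('Asset', 'TimeTags'), "\n";
// prints: <div id="plugin_asset" class="template_timetags">
```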
Various items with "special" IDs exist as well, see the list below. Usually these are items of general nature which are only allowed to appear once per page... thus the ID. Items are prefixed with 'item_'. There is, for instance, one item on the main page called 'item_editor'. It groups the editor's head, the editor itself and the bar to select other editors, since all three are logically related. The 'item_main' is quite similar.
Classes are used to identify "objects". For instance a news item, a document, a result table or a selection bar. Objects are usually constructions of several nested tags where only the surrounding tag is marked with the class name. Use descendant selectors to format the inside.
there are a number of "special" classes that could be set by display logic in template code (in fact it's sort of an extension to CSS pseudo elements). For instance if the user clicked an item in a list then it may put in a marker that allows highlighting the item. These classes have verbal names such as "selected". Please do *not* assign generic styles to them.
all CLASSES and IDs must be LOWERCASE. (this is currently conflicting with plugin names but can be easily solved by applying the 'lower' filter in smarty)
In general all HTML elements should be used the way they are supposed to be used. For example using <div> to emphasize a single word in a text *is* wrong. Inline elements such as <em>, <strong>, ... should be used for this purpose.
Try to avoid tags wherever possible. Less is better. CSS selectors are very powerful. Utilize them. Especially avoid nested <div> constructions. This *is* ugly. The goal is to have considerably less markup than content. StockManiac output should be seen as raw document text which can still be read by human beings.
Tables are *not* to be used for any formatting or layout purposes. Do not put cellpadding, cellspacing, align, ... etc. attributes into tables. If using a table make sure that <caption>, <thead>, <tbody> and, where applicable, <tfoot> are well defined.
Do not use tags and attributes deprecated in XHTML. Especially avoid all those formatting-only tags such as <font>, <b>, <i>, ... as all of this should be done in CSS only.
Never put "style" attributes into any tag. Styles always go into CSS files. (There is actually one exception to this rule: the RiskPlugin uses style="..." to set the background color for the risk indicator. That's because the actual color is chosen by ranking logic in the code. This should remain the one and only exception...)
Markup code must be structured in a meaningful way in order to get the full power out of CSS selectors. Therefore stick to a few conventions:
Special Class Names
<span class="item_commentbox_content">
element holding the actual content. Then a
<span class="item_commentbox_marker">
element should be used somewhere close to indicate the presence of a hidden content box, for example:
<a class="item_commentbox"> <span class="item_commentbox_content">this is hidden content</span> </a> <span class="item_commentbox_marker"><!-- hover here --></span>
where the <a> can be any other tag as well.
Special ID Names
Unit testing is done with the SimpleTest framework (http://www.simpletest.org/). The framework will be shipped with StockManiac as external_package/, the testcases will ship as a separate tarball/rpm package. This makes it possible to verify correct behaviour of field installations.
The stockmaniac-base package provides two interfaces for unit testing: stockmaniac_webtestcase is to run web tests through the scriptable browser supplied by SimpleTest. stockmaniac_unittestcase is to run tests directly on classes or functions.
Everything else related to tests resides in the gui/test/ directory.
To run the whole suite just access run_all.php through a browser (it can also run directly on the shell, have a look at the simpletest docs). To run partial tests check what other run_*.class.php files are available in test/.
The tests are run against a real database which is set up automatically by - right - a testcase. The file mysql-structure.sql will be used as testing schema. Connection details are specified in path.inc.class.php. (I am aware of SimpleTest's "mock objects" but I also know that there have been many bugs because of discrepancies between code and schema and testdata vs. new code... etc... so I want to test the whole thing...).
Generally the unit testing in StockManiac is focused on testing end-user functionality with SimpleTest's WebTestCase and SimpleBrowser features (available via stockmaniac_webtestcase and stockmaniac_unittestcase). The reasoning behind this is simple: it will test all application layers, from the database via classes to display code to the template engine. That way we test large chunks of code at once with only a few simple tests (often as simple as sending a form and checking the resulting output).
To keep all the testcases organized I made up a few conventions:
testcase files should be named like this:
run_<desc>.class.php
suite_<desc>.class.php
test_<desc>.class.php
classtest_<classname>.class.php
webtest_<pluginname>.class.php
all other testcases (except webtest_<pluginname>.class.php files) should be written as needed, e.g. when fixing a bug or refactoring something... etc...
as usual the first part of the filename (everything before the .class.php extension) must match with the class definition in the file. This is so the __autoload() mechanism will work.
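The reason for the file naming rule becomes clear when looking at a class loader. The following is a sketch only, not the loader StockManiac ships. It uses spl_autoload_register(), which replaces the __autoload() function on current PHP versions (__autoload() was removed in PHP 8). The helper testcase_file_for() is a hypothetical name:

```php
<?php
// Assumed layout: one class per file, named <classname>.class.php.
function testcase_file_for(string $className, string $dir): string
{
    return rtrim($dir, '/') . '/' . $className . '.class.php';
}

// Register a loader that looks for the class file next to this script.
spl_autoload_register(function (string $className): void {
    $file = testcase_file_for($className, __DIR__);
    if (is_file($file)) {
        require_once $file;
    }
});
```

With this in place, instantiating webtest_selectday (a made-up class name) would make PHP look for webtest_selectday.class.php, which only works if the file name and the class name match.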
Stockmaniacd is the daemon that is responsible mainly for retrieving quotes and news items. Since version v0.14.0 it also solves math expressions generated by the autofigures.
stockmaniacd is a multi-threaded daemon that handles Quote and News updates as well as math expression solving seamlessly in the background, without requiring user interaction and/or the GUI part being present. All code is written in Perl, using interpreter-based threads (ithreads).
The workflow inside the daemon is visualized in the following picture. Essentially it pushes workitem objects (FIXME: link) through its queues from thread to thread.
Each thread is implemented as its own class following this scheme:
Quote updates are handled with finance-quote.sourceforge.net, News updates are done via XML::RSS and (later) XML::Atom. Math::Expression::Evaluator is used for math expression solving.
Communication between GUI and Daemon is realized by utilizing the XML-RPC protocol as implemented in RPC::XML
in the future, stockmaniacd might do much more than just fetching data from the internet (e.g. email notifications on certain events, ... etc...)
users should be able to see the current update status
users should be able to request an instant update
data retrieval in general
quote updates
NewsFeeds updates
configuration
communication with the web-application
status and logging facilities
security
speed
notifications (not yet implemented)
Why perl iThreads?
Why RPC::XML
Data Transfer between Threads
Signaling to control Threads
use StockManiac::Workitem;
my $signal = StockManiac::Workitem->new('_SIGNAL_', SIGTERM, ...);
Scheduler Design
Dispatcher Design
Worker Design
o each worker runs as thread
  - the number of worker threads of each kind is configurable
  - watch daemon status logs to fine-tune the numbers
o receive workitem objects from Dispatcher
  - each workhorse class (Quote, News, ...) has a separate queue
  - workers block when the queue is empty
o process workitem(s)
  - it is up to the workhorse to decide whether to process *one* item at a time or to fetch 5, 10 (or more) items in a row and process them at once
  - the order in which workitems come in should not matter (!)
o completed work is sent back to the dispatcher
  - through a queue
  - again: the order should not matter.
o a worker must be stateless
  - we do *not* know if one worker will ever process the same workitem again, since the worker might be stopped or die or some other worker will receive the workitem...
  - therefore all information that is necessary for processing a workitem *must* be kept within the workitem (there are data storage methods provided by the StockManiac::Workitem class)
Listener Design
o runs as thread
o listen for XML-RPC requests from the GUI (or something else)
o process the request, get answers by talking to all other parts of the daemon via the scheduler (StockManiac::Workitem objects are actually a kind of communication protocol too)
o it's a proper implementation which supports 'introspection' by default
o a limitation is that the listener can process one request at a time. If two come in, one must wait. This appears to be no problem in real life as answers are usually pretty quick.
There are two essential data structures used in stockmaniacd. One are WORKITEMs, the others are CATALOGs.
Workitems are small chunks of data (objects in fact). They (better: references to them) are passed through the queues from one thread to the other. The threads take them as input, do whatever they do with their content, then hand them over to the next thread (queue)...
A single work item is an object created from StockManiac::Workitem. Like:
$workitem = StockManiac::Workitem->new(
    dest          => 'quote',
    profession    => 'stock',
    task          => 'update',
    id            => 23,
    lastprocessed => 1166178567,   # unix timestamp
    frequency     => 36200         # seconds
    ...
);
$workitem->add_data( { 'somevar'=>0, 'someothervar'=>1, ... } );
$workitem->add_raw_data( { 'rawvar'=>5, 'otherrawvar'=>10, ... } );
The first should *only* be used with data that has undergone sanity checks, so everything returned by get_data() can be considered 'safe'. The 'raw' functions are meant for unsanitized data, e.g. structures as received from the internet. Raw data should always be considered 'insecure'.
Furthermore, workers, scheduler, dispatcher, ... may do things with workitems:
$workitem->is_due()               # is this item due for processing?
$workitem->start_processing()     # this item is now 'in progress'
$workitem->is_signal()            # is this a signal or regular work?
$workitem->set_status(1);         # whatever we did succeeded
$workitem->finished_processing()  # whatever we did is done now
...
Catalogs are rather big and currently kept in the Scheduler only (that might change in the not-too-far future). Anyhow, a CATALOG is a collection of workitem objects. For instance the Scheduler uses catalog data to figure out what must be done next (i.e. which workitems are due for processing).
Technically speaking a catalog is an 'array of workitem objects'. Typically it looks like:
@CATALOG_name = (
    { $WORKITEM_1 },
    { $WORKITEM_2 },
    ...
    { $WORKITEM_n },
    ...
);
@CATALOG_stocks
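How the Scheduler could pick due workitems out of a catalog is sketched below. The daemon itself is written in Perl; PHP is used here only to stay consistent with the other examples in this guide. The rule "due when lastprocessed + frequency has passed" is an assumption about what is_due() does, and workitems are modelled as plain arrays rather than StockManiac::Workitem objects:

```php
<?php
// Assumption: a workitem is due once 'frequency' seconds have passed
// since 'lastprocessed'. The real is_due() may apply further rules.
function is_due(array $workitem, int $now): bool
{
    return $workitem['lastprocessed'] + $workitem['frequency'] <= $now;
}

// Returns all workitems of a catalog that are due at the given time.
function due_items(array $catalog, int $now): array
{
    return array_values(array_filter($catalog, function (array $workitem) use ($now) {
        return is_due($workitem, $now);
    }));
}

// Made-up catalog with two workitems.
$catalog = [
    ['id' => 23, 'lastprocessed' => 1166178567, 'frequency' => 36200],
    ['id' => 24, 'lastprocessed' => 1166178567, 'frequency' => 60],
];
$due = due_items($catalog, 1166178567 + 3600);   // only id 24 is due
```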
Documentation generated on Sun, 22 Aug 2010 11:20:10 +0200 by phpDocumentor 1.4.3