Thursday, November 01, 2007

Database Normalization And Design Techniques

One of the most important factors in dynamic web page development is database definition. If your tables are not set up properly, it can cause you a lot of headaches down the road when you have to perform miraculous SQL calls in your PHP code in order to extract the data you want. By understanding data relationships and the normalization of data, you will be better prepared to begin developing your application in PHP. ----
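We can build a system quickly, but will it be highly available, and can it handle a large number of concurrent users? A few days ago the news reported that when the second phase of Olympic ticket sales opened, the ticketing website crashed the same day because too many users hit it at once. I think this was predictable: everyone knew how fierce the rush for Olympic tickets would be, so the site simply was not prepared. Here I am mainly concerned with system performance, that is, supporting a high volume of concurrent users. Of all the factors that determine system performance, database design is probably the most important; it is a data-architecture-level issue and is decisive. Other factors matter too, of course, such as remote calls in distributed systems and data caching.
I found some material online and have organized it here.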
Basically, the Rules of Normalization are enforced by eliminating redundancy and inconsistent dependency in your table designs.

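  1 Introduction
  The goals of database optimization are simply to avoid disk I/O bottlenecks, reduce CPU utilization, and reduce resource contention. To make this easier to follow, the author consulted reference material for large database systems such as Sybase, Informix, and Oracle and, drawing on years of engineering practice, discusses the topic from three angles: base table design, extended design, and the placement of database table objects. The emphasis is on avoiding disk I/O bottlenecks and reducing resource contention.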

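  2 Base table design based on third normal form

  In a table-driven management information system (MIS), the design standard for base tables is third normal form (3NF). The defining feature of 3NF is that non-key attributes depend only on the primary key. Tables designed to 3NF have many advantages: (1) redundant data is eliminated, which saves disk space; (2) there are sound integrity constraints, namely referential integrity based on primary and foreign keys and entity integrity based on primary keys, which makes the data easy to maintain, migrate, and update; (3) the data is reversible, so joins and table merges neither lose nor duplicate rows; (4) because redundant columns are eliminated, each data page holds more rows during a SELECT, which reduces logical I/O, and the cache covers more rows, which also reduces physical I/O; (5) performance is good for most transactions; (6) the physical design is flexible and can keep up with growing user demands.

  In base table design, primary keys, foreign keys, and indexes play a very important role, yet system designers often focus only on meeting user requirements and do not consider them from the standpoint of system optimization. In fact they are closely tied to runtime performance. The following discusses these basic concepts and their significance from the perspective of database optimization:

  (1) Primary key: primary keys appear in complex SQL statements and are used frequently in data access. A table has only one primary key. A primary key should have a fixed value (it cannot be NULL or a default value, and should be relatively stable), carry no coded information, and be easy to access. It only makes sense to use commonly known columns as the primary key. Short primary keys are best (under 25 bytes): the length of the key affects the size of the index, the size of the index affects the number of index pages, and that in turn affects disk I/O. Primary keys are either natural or artificial. A natural key is made up of attributes of the entity and may be composite; a composite key should not have too many columns, because composite keys complicate join operations and increase the size of the tables that reference them as foreign keys. An artificial key is created when there is no suitable natural key, or when the natural attributes are complex or sensitive. Artificial keys are usually integers (satisfying the minimality requirement) with no real-world meaning; they slightly increase the size of the table but reduce the size of the tables that use them as foreign keys.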

OIDs Should Have No Business Meaning
A very critical issue that needs to be pointed out is that OIDs should have absolutely no business meaning
whatsoever. Nada. Zip. Zilch. Zero. Any column with a business meaning can potentially change, and if
there’s one thing that we learned over the years in the relational world it’s that it’s a fatal mistake to give
your keys meaning. If your users decide to change the business meaning, perhaps they want to add some
digits or make the number alphanumeric, you need to make changes to your database in every single spot
where you use that information. Anything that is used as a primary key in one table is virtually guaranteed
to be used in other tables as a foreign key. What should be a simple change, adding a digit to your
customer number, can be a huge maintenance nightmare. Yuck. In the relational database world, this OID
strategy is referred to as employing surrogate keys.


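  (2) Foreign key: foreign keys establish the relationships between tables in a relational database (referential integrity). A primary key can only migrate from an independent entity to a dependent entity, where it becomes an attribute of the latter and is called a foreign key.

  (3) Index: the performance benefit of indexes is obvious. Creating indexes on all columns commonly used in WHERE clauses and on all columns used for sorting avoids full table scans and, without changing the physical structure of the table, gives direct access to specific data, which reduces access time. Indexes can optimize or eliminate time-consuming sort operations. Spreading data across different pages also spreads out inserts. A primary key automatically creates a unique index, so the unique index also guarantees uniqueness (entity integrity). The smaller the index key, the more direct the lookup. A freshly built index performs best, so rebuilding indexes periodically is necessary. Indexes also have costs: they take up space, they take time to build, and they add maintenance overhead to INSERT, DELETE, and UPDATE operations. There are two kinds of index: clustered and nonclustered. A table can have only one clustered index but many nonclustered indexes. Queries through a clustered index are faster than queries through a nonclustered one. Before creating an index, use the database system's functions to estimate its size.

  ① Clustered index: the data pages of a clustered index are stored in physical order, and it takes up little space. Choose it for columns used in WHERE clauses, including range queries, fuzzy queries, or highly repetitive columns (sequential disk scans); for columns used in join operations; and for columns used in ORDER BY and GROUP BY clauses. A clustered index slows down inserts, and there is no need to build the clustered index on the primary key.

  ② Nonclustered index: compared with a clustered index, it takes more space and is less efficient. Choose it for columns used in WHERE clauses, including range queries, fuzzy queries (when there is no clustered index), primary or foreign key columns, and point (pointer-like) or small-range queries (returning less than 20% of the table); for columns used in join operations and primary key columns (range queries); for columns used in ORDER BY and GROUP BY clauses; and for columns that need to be covered. Building several nonclustered indexes on a read-only table is beneficial. Indexes also have drawbacks: creating them takes time, they occupy a lot of disk space, and they add maintenance cost (modifying an indexed column is slower). So when should you not create an index? Do not index small tables (fewer than 5 pages of data), small-to-medium tables (where single rows are not accessed directly or the result set is not sorted), single-valued domains (dense return values), index columns whose values are too long (over 20 bytes), columns that change often, highly repetitive columns, NULL-valued columns, or columns that are not used in WHERE clauses or join queries. Also, create as few indexes as possible on tables used mainly for data entry. And avoid creating useless indexes: when a WHERE clause has more than 5 conditions, the cost of maintaining the index outweighs its benefit, and it is more effective to store the relevant data in a temporary table.

  Notes on bulk imports: in practice, large batch computations (such as telecom call-record billing) are done in C programs. The resulting batch data (text files), computed from primary/foreign key relationships, can be loaded quickly with the system's own utilities (such as Sybase's BCP command). Before loading into a table, you can drop the table's indexes to speed up the import, then rebuild the indexes afterwards to optimize queries.

  (4) Locks: locks are an important mechanism for concurrent processing. They keep concurrent data consistent, that is, processed transaction by transaction, and the system uses them to guarantee data integrity. Deadlocks therefore cannot be ruled out entirely, but at design time you can think carefully about how to avoid long transactions, shorten the time exclusive locks are held, reduce user interaction inside transactions, and never let the user control the length of a transaction. Avoid running bulk jobs at the same time, especially long-running ones that use the same tables. Lock contention: a table can have only one exclusive lock at a time, so while one user holds it the others wait. As the number of users grows, server performance drops and the system appears to hang. How can deadlocks be avoided? Move from page-level to row-level locking, which reduces lock contention; pad small tables with dummy records (moving from page-level to row-level locking makes no difference for them); if there is contention within the same page, choose a suitable clustered index to distribute the data across different pages; create redundant tables; keep transactions short; and make sure a single batch involves no network interaction.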

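  (5) Query optimization rules: when accessing table data, avoid sorts, joins, and correlated subqueries as far as possible. Experience shows that when optimizing queries you should:
  ① touch as few rows as possible;
  ② avoid sorting, or sort as few rows as possible; if a large amount of data must be sorted, put it in a temporary table first; sort on simple keys (columns) such as integers or short strings;
  ③ avoid correlated subqueries within a table;
  ④ avoid complex expressions, non-leading substrings, and long string concatenations in the WHERE clause;
  ⑤ in the WHERE clause, prefer AND over OR;
  ⑥ use the temporary database. When a query involves many tables, many joins, complex logic, or data filtering, you can create temporary tables (with indexes) to reduce I/O, at the cost of extra space.
Unless every column is supported by an index, a query with joins builds two dynamic indexes separately and re-sorts them in a work table.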

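  3 Extended base table design
  Although tables designed to third normal form have the advantages described earlier in this article, in practice they sometimes work against runtime performance: a whole table is scanned when only part of the data is needed, many processes compete for the same data, the same rows are used again and again to compute the same result, and processes that pull data from several tables trigger a large number of join operations. All of this consumes disk I/O and CPU time.

  In particular, extend the base table design in the following situations: many processes access one table frequently, processes access subsets of the data, calculations are repeated, data is redundant, or users require that certain processes take priority or have low response times.

  How can these drawbacks be avoided? Splitting the relevant tables according to access frequency, storing redundant data, storing derived columns, and merging related tables are all effective ways to overcome them and optimize system performance.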

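  3.1 Splitting tables or storing redundant data
  Tables can be split horizontally or vertically. Splitting a table increases the cost of maintaining data integrity.
Horizontal splitting: in the first case, when several processes frequently access different rows of a table, split the table horizontally and remove redundant columns from the new tables; if a few processes need the whole data set, they must combine the pieces again, but that does not prevent splitting. A typical example is storing telecom call records split by month. In the second case, when the main process repeatedly accesses a subset of rows, it is best to put those rows in a separate subset table (redundant storage), which matters a great deal when disk space is not a concern. After the split, however, maintenance becomes harder: the copies must be updated immediately with triggers, or in batches with stored procedures or application code, and this adds extra disk I/O.

  Vertical splitting (which does not break third normal form): in the first case, when several processes frequently access different columns of a table, split the table vertically into several tables. This reduces disk I/O (each row has fewer columns, so each page holds more rows and fewer pages are used), updates do not need to consider locks, and there is no redundant data. The drawback is that inserts and deletes must take data integrity into account, which is maintained with stored procedures. In the second case, when the main process repeatedly accesses a subset of columns, it is best to store those frequently accessed columns in a separate subset table (redundant storage), which matters a great deal when disk space is not a concern. This makes the overlapping columns harder to maintain: they must be updated immediately with triggers, or in batches with stored procedures or application code, and this adds extra disk I/O. Vertical splitting makes the best possible use of the cache.

  In short, splitting a table for the main processes is appropriate when the processes need disjoint subsets of the table, when each process needs only a subset of the table, and when the frequently run main processes do not need the whole table. When one main, frequently run process needs a subset of the table while other frequently run processes need the whole table, create a redundant subset table.
Note that after splitting a table you should consider rebuilding its indexes.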
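
The monthly split of call records mentioned above can be sketched in application code. This is a minimal illustration in Java, not part of the original material: the base table name `call_detail`, the `_YYYY_MM` suffix scheme, and the `caller_id` column are all hypothetical. It shows that a process working on one month touches a single partition, while a process that needs a wider range has to combine several.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Routes call records to per-month tables (hypothetical naming scheme). */
final class MonthlyPartitionRouter {
    private final String baseTable;

    MonthlyPartitionRouter(String baseTable) {
        this.baseTable = baseTable;
    }

    /** Name of the partition table holding rows for the given call date. */
    String tableFor(LocalDate callDate) {
        return String.format(Locale.ROOT, "%s_%04d_%02d",
                baseTable, callDate.getYear(), callDate.getMonthValue());
    }

    /** All partitions a query must combine to cover an inclusive date range. */
    List<String> tablesBetween(LocalDate from, LocalDate to) {
        List<String> tables = new ArrayList<>();
        YearMonth last = YearMonth.from(to);
        for (YearMonth m = YearMonth.from(from); !m.isAfter(last); m = m.plusMonths(1)) {
            tables.add(tableFor(m.atDay(1)));
        }
        return tables;
    }

    /** A single-month lookup reads one partition only (caller_id is an assumed column). */
    String monthlyQuery(LocalDate anyDayInMonth) {
        return "SELECT * FROM " + tableFor(anyDayInMonth) + " WHERE caller_id = ?";
    }
}
```

The range method makes the trade-off visible: the more months a process needs, the more partitions it has to read and merge, which is why this kind of split suits workloads that mostly stay within one month.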

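  3.2 Storing derived data
  For processes that perform a lot of repetitive calculation, consider storing the computed result (redundant storage) if the repeated calculation always yields the same result (the source columns are stable, so the result does not change), if the calculation involves many rows and needs extra disk I/O, or if the calculation is complex and needs a lot of CPU time. The cases are as follows:
  If the calculation is repeated within a single row, add a column to the table to store the result. When a column that takes part in the calculation is updated, a trigger must update the new column.

  If the calculation is repeated per category over a table, add a new table (usually two columns, category and result, are enough) to store the results. When a column that takes part in the calculation is updated, the new table must be updated immediately with a trigger, or in batches with a stored procedure or application code.

  If the calculation is repeated over many rows (such as ranking), add a column to the table to store the result. When a column that takes part in the calculation is updated, a trigger or stored procedure must update the new column.

  In short, storing redundant data speeds up access, but it violates third normal form and increases the cost of maintaining data integrity: the data must be updated immediately with triggers, or in batches with stored procedures or application code.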

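  3.3 Eliminating expensive joins
  For main processes that frequently access several tables at once, consider storing redundant data in the main table, that is, redundant or derived columns (which do not depend on the primary key). This breaks third normal form and makes maintenance harder: when the corresponding column in the source table changes, a trigger or stored procedure must update the redundant column. When a main process always accesses two tables together, the tables can be merged, which reduces disk I/O but again breaks third normal form and makes maintenance harder. Parent-child tables and 1:1 tables merge differently: merging parent-child tables produces a redundant table, while merging 1:1 tables produces redundant data within the table.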

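  4 Placement strategy for database objects
  The placement strategy for database objects is to distribute data evenly across the system's disks, balancing I/O access and avoiding I/O bottlenecks.

  ⑴ Spread access across different disks, that is, let user data span as many devices as possible so that several I/O channels run in parallel, avoiding I/O contention and access bottlenecks; place randomly accessed data and sequentially accessed data separately.
  ⑵ Separate system database I/O from application database I/O. Put system audit tables and temporary tables on disks that are not busy.
  ⑶ Put the transaction log on its own disk. This reduces disk I/O overhead, helps recovery after a failure, and improves the safety of the system.
  ⑷ Put frequently accessed "hot" tables on different disks; put frequently used tables and frequently joined tables on separate disks, and even place the columns of frequently accessed tables on different disks, so that access is spread across disks and I/O contention is avoided.
  ⑸ Use segments to separate frequently accessed tables from their (nonclustered) indexes, and to separate text and image data. The purpose of segments is to balance I/O, avoid bottlenecks, increase throughput, enable parallel scans, raise concurrency, and maximize disk throughput. Use logical segments to place "hot" tables and their nonclustered indexes separately in order to balance I/O. Of course, it is best to use the system's default segments. Segments also make backup and recovery more flexible, and make system authorization more flexible.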




Friday, August 10, 2007

Configuring Subversion on Ubuntu

http://wiki.ubuntu.org.cn/SubVersion
This is a document from the Ubuntu Chinese site. I followed it through an installation and it worked fine.

Monday, June 18, 2007

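Installing Ubuntu 7.04 from the hard disk

I was quite unhappy with the boot splash of Ubuntu 6.10, so I had long wanted to upgrade to 7.04. But I just heard that the 7.04 boot splash is just as dated, so I will stay where I am and wait a bit longer... One unfortunate thing is that my laptop's optical drive seems to be broken, which has forced me to work out how to install from the hard disk. I found some material online and decided to write it down for myself; even though I am not upgrading now, it pays to be prepared.
-----------------------------
1. Download ubuntu-7.04-alternate-i386.iso from http://releases.ubuntu.com/feisty/ and put it in C:\, making sure C: is a FAT32 partition (this is the killer: my Windows partitions are all NTFS, and apparently something called intz does not support NTFS).
2. Download the following files from http://archive.ubuntu.com/ubuntu/dists/feisty/main/installer-i386/current/images/hd-media/ and copy them to C:\ as well:
initrd.gz
vmlinuz
3. Some guides include this step and some do not, so I note it here anyway:
Download grub_for_dos-0.4.2, extract grldr from it and copy it to C:\, then edit C:\BOOT.INI and add the line: C:\GRLDR="GRUB"
4. Boot into GRUB. When the menu appears, press C to enter GRUB's command-line mode, then enter the following commands to start the installer: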
grub> kernel (hd0,0)/vmlinuz root=/dev/ram ramdisk_size=256000 devfs=mount,dall
grub> initrd (hd0,0)/initrd.gz
grub> boot
Hmm, I don't know what /dev/ram means. Why isn't it /dev/hd0?
In theory, you should now see the installer screen!

Tuesday, May 29, 2007

Asynchronous calls and remote callbacks using Lingo Spring Remoting

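A very good technical article. At the very least it helps clarify remote object references (pass by value vs. pass by reference), and it shows some neat programming techniques.
What I find coolest is that Lingo can expose some methods of an interface as synchronous and others as asynchronous. But from the sample code in this article I could not work out how it defines solve as asynchronous while cancel and registerXX are synchronous.

See: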
http://jroller.com/page/sjivan?entry=asynchronous_calls_and_callbacks_using

As mentioned in my previous blog entry, Lingo is the only Spring Remoting implementation that supports asynchronous calls and remote callbacks. Today I'll cover all the nitty gritty details of the async/callback related functionality along with the limitations and gotchas.

Asynchronous method invocation and callback support by Lingo is an awesome feature and there are several use cases where these are an absolute must. Let's consider a simple and rather common use case: you have a server side application (say an optimizer) for which you want to write a remote client API. The API has methods like solve() which are long running and methods like cancel() which stop the optimizer solve.

A synchronous API under such circumstances is not really suitable since the solve() method could take a really long time to complete. It could be implemented by having the client code spawn its own thread and do its own exception management, but this becomes really kludgy. Plus you have to worry about network timeout issues. You might be thinking "I'll just use JMS if I need an asynchronous programming model". You could use JMS, but think about the API you're exposing. It's going to be a generic JMS API where the client is registering JMS listeners and sending messages to JMS destinations using the JMS API. Compare this to a remote API where the client is actually working with the Service interface itself.

Lingo combines the elegance of Spring Remoting with the ability to make asynchronous calls. Let's continue with our Optimizer example and implement a solution using Lingo and Spring.

OptimizerService interface

public interface OptimizerService {
    void registerCallback(OptimizerCallback callback) throws OptimizerException;

    void solve();

    void cancel() throws OptimizerException;
}

The solve() method is asynchronous while the cancel() and registerCallback(..) methods are not. Asynchronous methods by convention must not have a return value and also must not throw exceptions. The registerCallback(..) method registers a client callback with the Optimizer. In order to make an argument be a remote callback, the argument must implement java.util.EventListener or java.rmi.Remote. In this example the OptimizerCallback interface extends java.util.EventListener. If the argument does not implement either of these interfaces, it must implement java.io.Serializable and it will then be passed by value.
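
The convention in the paragraph above can be checked mechanically with reflection. The following is a minimal sketch, not Lingo's actual code: the class `OneWayConvention` and the `Demo*` types are hypothetical stand-ins that mirror the interfaces in this article. It also suggests why solve() ends up asynchronous while cancel() and registerCallback(..) do not: only solve() is both void and free of declared exceptions. The `true` passed to SimpleMetadataStrategy in the client configuration presumably switches this convention on, but that is an assumption based on the article rather than on the Lingo source.

```java
import java.lang.reflect.Method;
import java.util.EventListener;

/** Hypothetical exception and interfaces mirroring the article's API. */
class DemoException extends Exception {}

interface DemoCallback extends EventListener {
    void solveComplete(float solution);
}

interface DemoService {
    void registerCallback(DemoCallback callback) throws DemoException;
    void solve();
    void cancel() throws DemoException;
}

/** Illustrates the stated convention; this is not how Lingo is implemented internally. */
final class OneWayConvention {

    /** A call can be fire-and-forget only if the caller has nothing to wait for. */
    static boolean isOneWay(Method method) {
        return method.getReturnType() == void.class
                && method.getExceptionTypes().length == 0;
    }

    /** Convenience lookup by name, so callers need not handle reflection exceptions. */
    static boolean isOneWay(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                return isOneWay(m);
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    /** Arguments of these types are passed as remote callbacks rather than by value. */
    static boolean isRemoteCallback(Class<?> parameterType) {
        return EventListener.class.isAssignableFrom(parameterType)
                || java.rmi.Remote.class.isAssignableFrom(parameterType);
    }
}
```

Under this reading, dropping `throws OptimizerException` from cancel() would make it asynchronous as well, so the throws clause is what keeps it a blocking call.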

OptimizerCallback interface

public interface OptimizerCallback extends EventListener {

    void setPercentageComplete(int pct);

    void error(OptimizerException ex);

    void solveComplete(float solution);
}

The callback API has a method for the Optimizer to set the percentage complete, report an error during the solve() process (remember that the solve() method is asynchronous so it cannot throw an exception directly) and finally the solveComplete(..) callback to inform the client that the solve is complete along with the solution.

OptimizerService implementation

public class OptimizerServiceImpl implements OptimizerService {

    // volatile: written by the registerCallback()/cancel() threads, read by the solve() thread
    private volatile OptimizerCallback callback;
    private volatile boolean cancelled = false;

    private static final Log LOG = LogFactory.getLog(OptimizerServiceImpl.class);

    public void registerCallback(OptimizerCallback callback) {
        LOG.info("registerCallback() called ...");
        this.callback = callback;
    }

    public void solve() {
        LOG.info("solve() called ...");
        float currentSolution = 0;

        // simulate long running solve process
        for (int i = 1; i <= 100; i++) {
            try {
                currentSolution += i;
                Thread.sleep(1000);
                if (callback != null) {
                    callback.setPercentageComplete(i);
                }
                if (cancelled) {
                    break;
                }
            } catch (InterruptedException e) {
                System.err.println(e.getMessage());
            }
        }
        // guard against solve() being called before any callback was registered
        if (callback != null) {
            callback.solveComplete(currentSolution);
        }
    }

    public void cancel() throws OptimizerException {
        LOG.info("cancel() called ...");
        cancelled = true;
    }
}

The solve() method sleeps for a while and makes the call setPercentageComplete(..) on the callback registered by the client. The code is pretty self explanatory here.

Optimizer Application context - optimizerContext.xml

We now need to export this service using Lingo Spring Remoting. The typical Lingo Spring configuration as described in the Lingo docs and samples is:

<?xml version="1.0" encoding="UTF-8"?>

<beans>
    <bean id="optimizerServiceImpl" class="org.sanjiv.lingo.server.OptimizerServiceImpl" singleton="true"/>

    <bean id="optimizerServer" class="org.logicblaze.lingo.jms.JmsServiceExporter" singleton="true">
        <property name="destination" ref="optimizerDestination"/>
        <property name="service" ref="optimizerServiceImpl"/>
        <property name="serviceInterface" value="org.sanjiv.lingo.common.OptimizerService"/>
        <property name="connectionFactory" ref="jmsFactory"/>
    </bean>

    <bean id="jmsFactory" class="org.activemq.ActiveMQConnectionFactory">
        <property name="brokerURL" value="tcp://localhost:61616"/>
        <property name="useEmbeddedBroker">
            <value>true</value>
        </property>
    </bean>

    <bean id="optimizerDestination" class="org.activemq.message.ActiveMQQueue">
        <constructor-arg index="0" value="optimizerDestinationQ"/>
    </bean>
</beans>

In this example, I'm embedding a JMS broker in the Optimizer process. However you are free to use an external JMS broker and change the JMS Connection Factory configuration appropriately.

Note : The above optimizerContext.xml is the typical configuration in the Lingo docs/examples
but is not the ideal configuration. It has some serious limitations which I'll cover in a bit
along with the preferred "server" configuration.

OptimizerServer

The "main" class that exports the OptimizerService simply needs to instantiate the "optimizerServer" bean in the optimizerContext.xml file.

public class OptimizerServer {

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("Usage : java org.sanjiv.lingo.server.OptimizerServer ");
            System.exit(-1);
        }
        String applicationContext = args[0];

        System.out.println("Starting Optimizer ...");
        FileSystemXmlApplicationContext ctx = new FileSystemXmlApplicationContext(applicationContext);

        ctx.getBean("optimizerServer");

        System.out.println("Optimizer Started.");

        ctx.registerShutdownHook();
    }
}

The Client

In order for the client to look up the remote OptimizerService, we need to configure the client side Spring application context as follows:

Client Application Context - clientContext.xml

<?xml version="1.0" encoding="UTF-8"?>

<beans>
    <bean id="optimizerService" class="org.logicblaze.lingo.jms.JmsProxyFactoryBean">
        <property name="serviceInterface" value="org.sanjiv.lingo.common.OptimizerService"/>
        <property name="connectionFactory" ref="jmsFactory"/>
        <property name="destination" ref="optimizerDestination"/>

        <property name="remoteInvocationFactory" ref="invocationFactory"/>
    </bean>

    <bean id="jmsFactory" class="org.activemq.ActiveMQConnectionFactory">
        <property name="brokerURL" value="tcp://localhost:61616"/>
    </bean>

    <bean id="optimizerDestination" class="org.activemq.message.ActiveMQQueue">
        <constructor-arg index="0" value="optimizerDestinationQ"/>
    </bean>

    <bean id="invocationFactory" class="org.logicblaze.lingo.LingoRemoteInvocationFactory">
        <constructor-arg>
            <bean class="org.logicblaze.lingo.SimpleMetadataStrategy">

                <constructor-arg value="true"/>
            </bean>
        </constructor-arg>
    </bean>
</beans>

Now all a client needs to do is obtain a handle to the remote OptimizerService by looking up the bean "optimizerService" configured in clientContext.xml.

OptimizerCallback implementation

Before going over the sample Optimizer client code, let's first write a sample implementation of the OptimizerCallback interface, one which the client will register with the remote Optimizer by invoking the registerCallback(..) method.

public class OptimizerCallbackImpl implements OptimizerCallback {

    // solveComplete and callbackError are only read and written while holding mutex
    private boolean solveComplete = false;
    private OptimizerException callbackError;
    private final Object mutex = new Object();

    public void setPercentageComplete(int pct) {
        System.out.println("+++ OptimizerCallback :: " + pct + "% complete..");
    }

    public void error(OptimizerException ex) {
        System.out.println("+++ OptimizerCallback :: Error occurred during solve: " + ex.getMessage());
        synchronized (mutex) {
            callbackError = ex;
            solveComplete = true;
            mutex.notifyAll();
        }
    }

    public void solveComplete(float solution) {
        System.out.println("+++ OptimizerCallback :: Solve Complete with answer : " + solution);
        synchronized (mutex) {
            solveComplete = true;
            mutex.notifyAll();
        }
    }

    public void waitForSolveComplete() throws OptimizerException {
        synchronized (mutex) {
            // re-check the flag after every wakeup; wait() may return spuriously
            while (!solveComplete) {
                try {
                    mutex.wait();
                } catch (InterruptedException e) {
                    // restore the interrupt flag and stop waiting
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            // report an error even if it arrived before we started waiting
            if (callbackError != null) {
                throw callbackError;
            }
        }
    }
}

OptimizerClient

public class OptimizerClient {

    public static void main(String[] args) throws InterruptedException {

        if (args.length == 0) {
            System.err.println("Usage : java org.sanjiv.lingo.client.OptimizerClient ");
            System.exit(-1);
        }

        String applicationContext = args[0];
        FileSystemXmlApplicationContext ctx = new FileSystemXmlApplicationContext(applicationContext);

        OptimizerService optimizerService = (OptimizerService) ctx.getBean("optimizerService");
        OptimizerCallbackImpl callback = new OptimizerCallbackImpl();

        try {
            optimizerService.registerCallback(callback);
            System.out.println("Client :: Callback Registered.");

            optimizerService.solve();
            System.out.println("Client :: Solve invoked.");

            Thread.sleep(8 * 1000);
            System.out.println("Client :: Calling cancel after 8 seconds.");

            optimizerService.cancel();
            System.out.println("Client :: Cancel finished.");
            //callback.waitForSolveComplete();

        } catch (OptimizerException e) {
            System.err.println("An error was encountered : " + e.getMessage());
        }
    }
}

The test client registers a callback and calls the asynchronous method solve(). Note that the solve method in our sample OptimizerService implementation takes ~100 seconds to complete. The client then prints out the message "Client :: Solve invoked.". If the solve() call is indeed invoked asynchronously by Lingo under the hood, this message should be printed to the console immediately and not after 100 seconds. The client then calls cancel() after 8 seconds have elapsed.

Here's the output when we run the Optimizer Server and Client

Notice that the solve method has been called asynchronously. After 8 seconds the client makes the cancel() call; however, the server does not seem to receive it and continues with its setPercentageComplete(..) callbacks.

I asked this question on the Lingo mailing list but did not get a response. This misbehaviour was pretty serious because what this meant was that while an asynchronous call like solve() was executed asynchronously by the client, the client was not able to make another call like cancel() until the solve() method completed execution on the server... which defeats the purpose of a method like cancel().

Lingo and ActiveMQ are open source so I rolled up my sleeves and ran the whole thing through a debugger. Debugging multithreaded applications can get tricky, but after spending several hours I was able to get to the bottom of this issue.

Recall that we exported the OptimizerService using the class org.logicblaze.lingo.jms.JmsServiceExporter in optimizerContext.xml. On examining the source, I found that this class creates a single JMS Session which listens for messages on the configured destination ("optimizerDestinationQ" in our example). When a message is received, it invokes a Lingo listener which translates the inbound message into a method invocation on the exported OptimizerServiceImpl service object.

The JMS spec clearly states

A Session object is a single-threaded context for producing and consuming messages.
...
It serializes execution of message listeners registered with its message consumers.

Basically a single JMS Session is not suitable for receiving concurrent messages. I understood why the cancel() method wasn't being invoked until the solve() method completed. But this behavior still didn't make sense from an API usage perspective.
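
The effect of that serialization can be reproduced without JMS at all. The sketch below is an analogy under stated assumptions, not Lingo or JMS code: a fixed-size thread pool stands in for the pool of sessions, a task that waits stands in for solve(), and a second task stands in for cancel(). The class name `SessionDemo` and the 500 ms timeout are hypothetical choices for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Models "sessions" as worker threads that deliver messages one at a time. */
final class SessionDemo {

    /** Returns true if cancel() was delivered while solve() was still running. */
    static boolean cancelRunsDuringSolve(int sessions) {
        ExecutorService listeners = Executors.newFixedThreadPool(sessions);
        CountDownLatch solveStarted = new CountDownLatch(1);
        CountDownLatch cancelled = new CountDownLatch(1);
        try {
            // "solve": keeps working until it sees the cancel, or gives up after 500 ms
            Future<Boolean> solve = listeners.submit(() -> {
                solveStarted.countDown();
                return cancelled.await(500, TimeUnit.MILLISECONDS);
            });
            solveStarted.await();
            // "cancel": queued behind solve when there is only one worker
            listeners.execute(cancelled::countDown);
            return solve.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            listeners.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 session : cancel seen during solve = " + cancelRunsDuringSolve(1));
        System.out.println("2 sessions: cancel seen during solve = " + cancelRunsDuringSolve(2));
    }
}
```

With one worker the cancel task cannot start until the solve task returns, which is the behavior observed with the single-session exporter; with two workers it is delivered right away.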

Fortunately Spring 2.0 added support classes for receiving concurrent messages, which is exactly what we need (yep, Spring rocks!). There are a few different support classes: DefaultMessageListenerContainer, SimpleMessageListenerContainer, and ServerSessionMessageListenerContainer.

The ServerSessionMessageListenerContainer "dynamically manages JMS Sessions, potentially using a pool of Sessions that receive messages in parallel". This class "builds on the JMS ServerSessionPool SPI, creating JMS ServerSessions through a pluggable ServerSessionFactory".

I tried altering optimizerContext.xml to use this class.

optimizerContextPooledSS.xml

<?xml version="1.0" encoding="UTF-8"?>

<beans>
    <bean id="optimizerServiceImpl" class="org.sanjiv.lingo.server.OptimizerServiceImpl" singleton="true">
    </bean>

    <bean id="optimizerServerListener" class="org.logicblaze.lingo.jms.JmsServiceExporterMessageListener">
        <property name="service" ref="optimizerServiceImpl"/>
        <property name="serviceInterface" value="org.sanjiv.lingo.common.OptimizerService"/>
        <property name="connectionFactory" ref="jmsFactory"/>
    </bean>

    <bean id="optimizerServer" class="org.springframework.jms.listener.serversession.ServerSessionMessageListenerContainer">
        <property name="destination" ref="optimizerDestination"/>
        <property name="messageListener" ref="optimizerServerListener"/>
        <property name="connectionFactory" ref="jmsFactory"/>
    </bean>

    <bean id="jmsFactory" class="org.activemq.ActiveMQConnectionFactory">
        <property name="brokerURL" value="tcp://localhost:61616"/>
        <property name="useEmbeddedBroker">
            <value>true</value>
        </property>
    </bean>

    <bean id="optimizerDestination" class="org.activemq.message.ActiveMQQueue">
        <constructor-arg index="0" value="optimizerDestinationQ"/>
    </bean>
</beans>

Unfortunately the behavior was still the same - cancel() was not executing on the server until solve() completed. I posted this question on the Spring User list but did not get a response. This class uses the ServerSessionPool SPI so I'm not sure if there is a problem with the Spring class, the ActiveMQ implementation of this SPI or something that I've done wrong.

Anyway I was able to successfully configure the DefaultMessageListenerContainer class and observed the desired behavior. In contrast to ServerSessionMessageListenerContainer, DefaultMessageListenerContainer "creates a fixed number of JMS Sessions to invoke the listener, not allowing for dynamic adaptation to runtime demands". While ServerSessionMessageListenerContainer would have been ideal, DefaultMessageListenerContainer is good enough for most use cases as you'd typically want to have some sort of thread pooled execution on the server anyways.

optimizerContextPooled.xml

<?xml version="1.0" encoding="UTF-8"?>

<beans>

    <bean id="optimizerServiceImpl" class="org.sanjiv.lingo.server.OptimizerServiceImpl" singleton="true">
    </bean>

    <bean id="optimizerServerListener" class="org.logicblaze.lingo.jms.JmsServiceExporterMessageListener">
        <property name="service" ref="optimizerServiceImpl"/>
        <property name="serviceInterface" value="org.sanjiv.lingo.common.OptimizerService"/>
        <property name="connectionFactory" ref="jmsFactory"/>
    </bean>

    <bean id="optimizerServer" class="org.springframework.jms.listener.DefaultMessageListenerContainer">
        <property name="concurrentConsumers" value="20"/>
        <property name="destination" ref="optimizerDestination"/>
        <property name="messageListener" ref="optimizerServerListener"/>
        <property name="connectionFactory" ref="jmsFactory"/>
    </bean>

    <bean id="jmsFactory" class="org.activemq.ActiveMQConnectionFactory">
        <property name="brokerURL" value="tcp://localhost:61616"/>
        <property name="useEmbeddedBroker">
            <value>true</value>
        </property>
    </bean>

    <bean id="optimizerDestination" class="org.activemq.message.ActiveMQQueue">
        <constructor-arg index="0" value="optimizerDestinationQ"/>
    </bean>

</beans>

Note : Although some Lingo examples have the destination created as a Topic (ActiveMQTopic)
with the org.logicblaze.lingo.jms.JmsServiceExporter class, you must use a Queue when
using multiple JMS sessions for concurrent message retrieval, as a message sent to a Topic
will be received by all listeners, which is not what we want.

Here's the result when using optimizerContextPooled.xml

You can download the complete source for this here and run the sample server and client. JRoller doesn't allow uploading .zip files so I've uploaded the sample as a .jar file instead. The source distribution has a Maven 1.x project file. To build, simply run "maven". To run the optimizer server without pooled JMS listeners, run startOptimizer.bat under dist/bin/. To run with pooled JMS listeners, run startOptimizerPooled.bat, and to run the test client, run startClient.bat.

I am using this architecture to provide a remote API for our C++ optimizer. The C++ optimizer has a thin JNI layer which loads the Spring application context file, and the OptimizerServiceImpl has a bunch of native methods which are tied to the underlying C++ optimizer functionality using the JNI function RegisterNatives(). Do you Lingo? I'd like to hear how others are using Lingo/Spring Remoting.

Monday, March 12, 2007

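Configuring vsftp on Ubuntu

Ugh, this little thing is really annoying. I found a post online, tried following it, and it worked, so I am simply linking it here.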
http://linux.hiweed.com/node/1080