1. Merge with the Huchi branch (replace all uses of requireLink with enableGrad, which enables gradient computation for a tensor); 2. Update the global memory size (this may make the memory size slightly larger than in the old version).