Problem Restructuring in Integer Programming for Reduct Searching
Ungku Chulan, Ungku Azmi Iskandar (2003) Problem Restructuring in Integer Programming for Reduct Searching. Masters thesis, Universiti Putra Malaysia.
Standard Integer Programming / Decision Related Integer Programming (SIP/DRIP) is a reduct searching system that finds the reducts of an information system. Reducts are minimal sets of attributes of the information system that are useful in classification tasks. They can describe the whole information system for the purpose of discernment. As a result, they are very useful for generating rules for the classification problems that arise in data mining. The thesis focuses on improving the performance of the original SIP/DRIP algorithm. By using problem restructuring, search time and memory use are reduced. At the same time, the approach adheres to an essential criterion of the original method: performance must improve without sacrificing the quality of the reducts.

Problem restructuring changes the input of SIP/DRIP without losing any of the input's essential properties. In other words, no loss of knowledge occurs: only the structural order changes, while the contents are kept intact. In principle, this ensures that no adverse distortion occurs within SIP/DRIP. Restructuring is done by proposing a promising structure for the input to SIP/DRIP based on the potential of the attributes in the information system. Potential is predicted with an inexpensive measure, namely the total covering of each attribute within the information system. Although this measure is only an approximation, it is shown to work.

For the experiments, five data sets were taken from the UCI Machine Learning Repository. Results are measured by comparing the performance of SIP/DRIP with and without problem restructuring. In addition, the lengths of the reducts generated by both approaches are compared to ensure that no deterioration in quality occurred along the way. The experimental results show that problem restructuring generally improves SIP/DRIP across all the data sets.
This means that, on average, it enhances the performance of SIP/DRIP. Time and memory consumption were reduced considerably. Furthermore, the quality of the solutions was maintained: reduct length did not increase when restructuring was used. The approach is an additional component of SIP/DRIP that complements its search process. By giving more consideration to the initial problem space, and not just to the search for the solution, the performance of SIP/DRIP has been modestly improved.
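
The restructuring idea described above can be illustrated with a small sketch. It is not the thesis's implementation, and the abstract does not define "total covering" precisely. The sketch assumes that an attribute's covering is the number of discernibility clauses (pairs of objects with different decisions) in which the attribute appears, and that attributes are reordered by decreasing covering. The function names `discernibility_clauses`, `covering_counts` and `restructure` are hypothetical.

```python
from itertools import combinations


def discernibility_clauses(rows, decisions):
    """For every pair of objects with different decisions, collect the set of
    attribute indices on which the two objects differ."""
    clauses = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(k for k, (x, y) in enumerate(zip(a, b)) if x != y)
            if diff:
                clauses.append(diff)
    return clauses


def covering_counts(clauses, n_attrs):
    """Assumed covering measure: how many clauses each attribute appears in."""
    counts = [0] * n_attrs
    for clause in clauses:
        for k in clause:
            counts[k] += 1
    return counts


def restructure(clauses, n_attrs):
    """Return an attribute order by decreasing covering (ties broken by index).
    Only the order changes; the clauses themselves are left untouched."""
    counts = covering_counts(clauses, n_attrs)
    order = sorted(range(n_attrs), key=lambda k: (-counts[k], k))
    return order, counts


# Toy information system: 4 objects, 3 condition attributes, 1 decision.
rows = [(0, 0, 1), (0, 1, 1), (1, 1, 0), (1, 0, 0)]
decisions = [0, 0, 1, 1]

clauses = discernibility_clauses(rows, decisions)
order, counts = restructure(clauses, n_attrs=3)
print(counts)  # [4, 2, 4]
print(order)   # [0, 2, 1]

# 0/1 constraint rows with columns permuted into the new order; each row must
# contain at least one selected attribute in any valid reduct.
matrix = [[1 if k in clause else 0 for k in order] for clause in clauses]
```

In this toy example every clause is preserved and only the column order of the constraint matrix changes, which matches the abstract's statement that restructuring alters the structural order of the input while keeping its contents intact.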