1. If a file contains more than one newline symbol in a row, the program crashes (it thinks it is the end of the file, but it is not). Like:
PL EN
ABC TY
RET
YTE
REW WSF
In such situations we should fill the blank line with „Pusta linia.” in Polish and “Empty Line.” in English.
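A minimal Python sketch of the blank-line fix from item 1. The function name `fill_blank_lines` and the `lang` keys are only illustrative assumptions, and the tool itself may be written in another language:

```python
# Placeholder texts taken from item 1; the dictionary keys are an assumption.
PLACEHOLDERS = {"pl": "Pusta linia.", "en": "Empty Line."}

def fill_blank_lines(text: str, lang: str) -> str:
    """Replace every empty or whitespace-only line with the placeholder."""
    placeholder = PLACEHOLDERS[lang]
    lines = text.splitlines()
    filled = [line if line.strip() else placeholder for line in lines]
    result = "\n".join(filled)
    # Keep a trailing newline if the input had one.
    if text.endswith("\n"):
        result += "\n"
    return result
```

With this, a run of several newlines no longer looks like the end of the file, because every line carries some text.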
2. Currently the spell checker uses Aspell, but it is manual. If I wanted to use it, it would take me years in some cases :/ I need to make it automatic. Heydari said before that this is possible. I was thinking about a solution that does not only compare a word against some dictionary, because such a solution will probably make a lot of errors. I need it to do that, but also to generate a dictionary from the text I want to check and to count how many times each word repeats in it. If we know this while spell checking, we can compare each word not only against the language dictionary but also against this second, generated dictionary, so we have a better chance of making good corrections. An adjustable acceptance rate is required; if no candidate reaches that limit, leave the word as it is. If a word is inside a sentence and starts with a capital letter, leave it as it is.
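To make item 2 more concrete, here is a rough Python sketch of the two-dictionary idea. It is only one possible approach: the similarity measure (`difflib.SequenceMatcher`), the default `acceptance` of 0.8 and the `min_count` parameter are my assumptions, not fixed requirements, and Aspell is not used here at all.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

def build_text_counts(text: str) -> Counter:
    """Second dictionary: how many times each word occurs in the checked text."""
    return Counter(w.lower() for w in re.findall(r"[^\W\d_]+", text))

def correct_word(word, dictionary, text_counts, acceptance=0.8, min_count=2):
    """Return the best correction, or the word itself if nothing is good enough."""
    lower = word.lower()
    # Known to the language dictionary, or frequent in the text: keep it.
    if lower in dictionary or text_counts[lower] >= min_count:
        return word
    frequent = {w for w, c in text_counts.items() if c >= min_count}
    best, best_key = None, None
    for cand in sorted(set(dictionary) | frequent):
        sim = SequenceMatcher(None, lower, cand).ratio()
        if sim < acceptance:
            continue  # below the adjustable acceptance rate
        key = (sim, text_counts[cand])  # ties go to the more frequent word
        if best_key is None or key > best_key:
            best, best_key = cand, key
    if best is None:
        return word  # limit not reached: leave the word as it is
    return best.capitalize() if word[:1].isupper() else best

def correct_sentence(words, dictionary, text_counts, acceptance=0.8):
    """Check a list of words; capitalized words inside the sentence are skipped."""
    out = []
    for i, w in enumerate(words):
        if i > 0 and w[:1].isupper():
            out.append(w)  # probably a name, leave it
        else:
            out.append(correct_word(w, dictionary, text_counts, acceptance))
    return out
```

The sketch assumes the sentence is already split into plain words without punctuation. Looping over the whole dictionary for every unknown word is slow for large dictionaries, so a real implementation would need some kind of index.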
3. The program currently loses a lot of data: when we find a symbol that does not exist in Polish or English, we delete the whole line. It would be better to count such symbols only if they are not found in a word that starts with a capital letter (it is most likely a name), and to delete a line only if we find at least 2 such symbols.
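The rule from item 3 could look roughly like this in Python. The exact set of allowed characters (`ALLOWED`) is my assumption and should be adjusted to whatever the program accepts today:

```python
import string

# Assumed alphabet: ASCII letters, digits, punctuation, Polish letters,
# plus the Polish-style quotes and the en dash.
POLISH_LETTERS = "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation
              + POLISH_LETTERS + "„”“–")

def count_foreign_symbols(line: str) -> int:
    """Count symbols outside ALLOWED, ignoring words that start with a capital."""
    total = 0
    for word in line.split():
        if word[0].isupper():
            continue  # most likely a name, so its symbols are not counted
        total += sum(ch not in ALLOWED for ch in word)
    return total

def keep_line(line: str, max_foreign: int = 1) -> bool:
    """Delete the line only when at least 2 foreign symbols are found."""
    return count_foreign_symbols(line) <= max_foreign
```

Note that the first word of a sentence also starts with a capital letter, so in this sketch it is skipped as well.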
4. The program currently deletes all symbols in a sentence, leaving only dots. I think it would be better to also keep commas „ , ” and „-”, and to leave dots only at the end of a sentence.
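A small Python sketch of the punctuation rule from item 4, assuming each sentence is cleaned separately. The function name `clean_punctuation` is only illustrative:

```python
import re

def clean_punctuation(sentence: str) -> str:
    """Keep letters, digits, commas and hyphens; keep a dot only at the end."""
    body = sentence.strip()
    ends_with_dot = body.endswith(".")
    # Remove everything except word characters, whitespace, "," and "-".
    body = re.sub(r"[^\w\s,-]", "", body)
    # Collapse the gaps left behind by removed symbols.
    body = re.sub(r"\s+", " ", body).strip()
    return body + "." if ends_with_dot else body
```

For example, `clean_punctuation("Hello, world! This - is (a) test.")` gives `Hello, world This - is a test.`.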
5. Currently the program lowercases all words inside a sentence so that only the first word is capitalized. But what about names? I guess after cleaning we should compare the text with the names and surnames lists and make such words start with a capital letter (SOME NAMES CONSIST OF MORE THAN ONE WORD!!)
6. If we find a dot inside a sentence, most likely it is an abbreviation (shortcut). If we find one, we look it up in the abbreviation list and put its full form instead. If it is not there, we simply remove the dot, just like now.
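For item 6, the lookup could be sketched in Python as below. The mapping passed in as `abbreviations` is a hypothetical example of the list format, and the lookup is case sensitive because casing matters for the Polish list:

```python
def expand_abbreviations(sentence: str, abbreviations: dict) -> str:
    """Expand known abbreviations inside a sentence; otherwise drop the dot."""
    tokens = sentence.split()
    out = []
    for i, tok in enumerate(tokens):
        is_last = i == len(tokens) - 1
        # The dot on the last token is treated as the end of the sentence.
        if "." in tok and not is_last:
            # Known abbreviation: use its full form. Unknown: remove the dots.
            tok = abbreviations.get(tok, tok.replace(".", ""))
        out.append(tok)
    return " ".join(out)
```

This sketch splits on spaces only, so an abbreviation followed directly by a comma (like `np.,`) would not be found in the list and would just lose its dot.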
7. With names and surnames there is a little problem :) A name can be the same as an ordinary word. For example Mr. White, where white is also a color. The only idea I have for this is to check whether the name also appears in the dictionary. If it does, start it with a capital letter only when it comes after a salutation like Mr, Mrs, Dear etc. Maybe you have a better idea?
8. Salutations like Mr, Mrs, Dear etc. should also start with a capital letter.
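Items 5, 7 and 8 together could be sketched in Python like this. It assumes the sentence is already lowercased and split into words, and that the names list is stored as a mapping from the lowercase form to the proper form (this format is my assumption). Multi-word names are tried first, longest match first:

```python
def restore_capitals(words, names, dictionary, salutations, max_name_len=3):
    """Re-capitalize names and salutations in a list of lowercase words."""
    out = []
    i = 0
    while i < len(words):
        # Try multi-word names first, longest first (e.g. "new york").
        matched = False
        for n in range(min(max_name_len, len(words) - i), 1, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in names:
                out.append(names[phrase])
                i += n
                matched = True
                break
        if matched:
            continue
        word = words[i]
        after_salutation = i > 0 and words[i - 1] in salutations
        if word in salutations:
            out.append(word.capitalize())
        elif word in names and (word not in dictionary or after_salutation):
            # Ambiguous names (also ordinary words) need a salutation before them.
            out.append(names[word])
        else:
            out.append(word)
        i += 1
    return " ".join(out)
```

With `names = {"white": "White", "new york": "New York"}`, a dictionary containing `white`, and salutations `mr`, `mrs`, `dear`, the input `dear mr white bought a white car in new york` becomes `Dear Mr White bought a white car in New York`.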
9. Sentences with odd nesting were very problematic. Some parts (words, full phrases or even whole sentences) are duplicated, and some segments contain the same whole sentence repeated. For instance:
Sentence A. Sentence A.
or:
Part A, Part B, Part B, Part C
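One simple way to handle the two duplicate patterns from item 9 is to drop a part when it repeats the part directly before it. The Python sketch below does only that; it assumes abbreviations were already expanded, so every remaining dot ends a sentence, and it does not catch duplicates that are not adjacent:

```python
def drop_adjacent_repeats(text: str, delimiter: str) -> list:
    """Split on the delimiter and drop parts equal to the previous part."""
    parts = [p.strip() for p in text.split(delimiter) if p.strip()]
    kept = []
    for part in parts:
        if not kept or kept[-1].lower() != part.lower():
            kept.append(part)
    return kept

def dedupe_segment(segment: str) -> str:
    """Remove repeated sentences, then repeated comma-separated parts."""
    sentences = drop_adjacent_repeats(segment, ".")
    cleaned = [", ".join(drop_adjacent_repeats(s, ",")) for s in sentences]
    result = ". ".join(cleaned)
    # Put the final dot back only if the segment had one.
    return result + "." if segment.strip().endswith(".") else result
```

The comparison ignores letter case, so `Sentence a.` right after `Sentence A.` is also treated as a repeat.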
10. Add the aligner from project 1 as an option. Please also make a menu for using all features, just like the one we have right now. It is much more convenient than commands.
11. Once more, super detailed documentation - the previous one updated with the new improvements - with some math if possible :)
DATA (should be in an easy-to-edit format):
English Dictionary - you find and obtain
Polish Dictionary - I will provide
Polish Names - you download from [login to view URL]:Polski_-_Imiona
Polish Surnames - you download from [login to view URL]
Polish abbreviations (shortcuts) - you download from here [login to view URL] (casing matters)
Polish Salutations - I will provide
English Names - you find and obtain (also names of institutions etc., not only of people)
English Surnames - you find and obtain
English abbreviations (shortcuts) - you find and obtain
English Salutations - you find and obtain
Hi, I am an electrical engineer and I have sound knowledge of C, Java and Perl. I can help you with this text processing tool. Can you please tell me which language your tool is written in? Looking forward to your reply.
Dear Customer,
I can start now. Please note that I have a professional team to do your work, so if you are interested, kindly send me a PM so that I can start.
Best regards.