Text processing tool - improvement

$30-250 USD

In progress

Posted:

over 10 years ago

Paid on delivery

1. If one file contains more than one consecutive newline symbol, the program crashes (it thinks it is the end of the file, but it is not). Like: PL EN ABC TY RET YTE REW WSF In such situations we should fill the blank area with „Pusta linia.” in Polish and “Empty Line.” in English.

2. Currently the spell checker uses Aspell, but it is manual. If I wanted to use it, it would take me years in some cases :/ I need to make it automatic. Heydari said before that it is possible. I was thinking about a solution that will not only compare a word to some dictionary, because such a solution will probably make a lot of errors. I need it to be done that way, but also to generate a dictionary from the text I want to check and to count how many times each word repeats in it. If we know this while spell checking, we can compare not only to the dictionary of a language but also to a second, generated dictionary, so we will have a better chance of making good spell checking corrections. An adjustable acceptance rate is required; if a correction cannot reach that limit, leave the word as it is. If a word is inside a sentence and starts with a capital letter, we should leave it as it is.

3. The program currently loses a lot of data: when we find a symbol that does not exist in Polish or English, we delete the whole line. It would be better to count such symbols only if they were not found in a word that starts with a capital letter (it is most likely a name) and to delete a line only if we find at least 2 such symbols.

4. The program now deletes all symbols in a sentence, leaving only dots. I think it would be better to also keep commas „ , ” and „-”, and to leave dots only at the end of a sentence.

5. Currently the program lowercases all words inside a sentence so that only the first word is capitalized. But what about names? I guess after cleaning we should compare the text with a list of names and surnames and make such words start with a capital letter (SOME NAMES CONSIST OF MORE THAN ONE WORD!!).

6. If we find a dot inside a sentence, most likely it is a shortcut (abbreviation). If we find one, we look for it in the shortcut list and put its full name instead. If it is not there, we simply remove the dot, just like now.

7. With names and surnames there is a little problem :) A name can be the same as an ordinary word. For example Mr. White, and white is also a color. The only idea I have for this is to check whether a name also resides in the dictionary. If it does, start it with a capital letter only when it comes after a salutation like Mr, Mrs, Dear etc. Maybe you have a better idea?

8. Salutations like Mr, Mrs, Dear etc. should also start with a capital letter.

9. Very problematic was that there were sentences with odd nesting, such as: We can see that some parts (words or full phrases or even whole sentences) were duplicated. Furthermore, there are segments containing repeated whole sentences inside one segment. For instance: Sentence A. Sentence A. e.g. or: Part A, Part B, Part B, Part C e.g.

10. Add the aligner from project 1 as an option. Please also make a menu to use all features, just like it is right now. It is much more convenient than commands.

11. Once more, super detailed documentation: the previous one with the new improvements, and some math if possible :)

DATA (should be in an easy to edit format):
English Dictionary - you find and obtain
Polish Dictionary - I provide
Polish Names - you download from [login to view URL]:Polski_-_Imiona
Polish Surnames - you download from [login to view URL]
Polish shortcuts - you download from here [login to view URL] (casing matters)
Polish salutations - I will provide
English Names - you find and obtain (also names of institutions etc., not only of humans)
English Surnames - you find and obtain
English shortcuts - you find and obtain
English Salutations - you find and obtain
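
To make item 2 more concrete, here is a minimal sketch of a frequency-aware correction step. It is only an illustration under assumptions: the posting does not say which language the tool is written in, so Python is assumed, and the names `build_text_dictionary`, `correct_word`, `acceptance` and `min_count` are hypothetical. Similarity is measured with the standard library's `difflib.SequenceMatcher` rather than Aspell, and the way text frequency is used (as a tie-breaker between equally similar candidates) is one possible reading of the requirement, not the required design.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def build_text_dictionary(text):
    """Count how often each word (lowercased) occurs in the text being checked."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def correct_word(word, language_dict, text_counts,
                 acceptance=0.8, min_count=2, is_sentence_start=False):
    """Return a correction for `word`, or `word` unchanged if none is good enough."""
    # A capitalised word inside a sentence is probably a name: leave it alone.
    if word[:1].isupper() and not is_sentence_start:
        return word
    lower = word.lower()
    # Assumption: words in the language dictionary, or frequent in this text, are trusted.
    if lower in language_dict or text_counts.get(lower, 0) >= min_count:
        return word
    frequent = {w for w, n in text_counts.items() if n >= min_count}
    best, best_key = None, (0.0, 0, "")
    for candidate in set(language_dict) | frequent:
        similarity = SequenceMatcher(None, lower, candidate).ratio()
        # Rank by similarity first, then by how often the candidate occurs in the text.
        key = (similarity, text_counts.get(candidate, 0), candidate)
        if similarity >= acceptance and key > best_key:
            best, best_key = candidate, key
    if best is None:
        return word  # acceptance rate not reached: keep the word as it is
    return best.capitalize() if word[:1].isupper() else best


language_dict = {"hello", "world", "help"}
counts = build_text_dictionary("hello world hello helo")
print(correct_word("helo", language_dict, counts))   # hello
print(correct_word("xyzzy", language_dict, counts))  # xyzzy (no candidate reaches 0.8)
```

Raising `acceptance` makes the checker more conservative; a word for which no candidate reaches the limit is returned unchanged, as item 2 asks.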
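
The line filter from item 3 could be sketched as follows. This is a hedged illustration, not the tool's actual code: `ALLOWED`, `count_foreign_symbols` and `keep_line` are hypothetical names, Python is an assumed language, and only letters are counted as "symbols" (whether digits or punctuation should count is not stated in the posting).

```python
# Assumption: the allowed alphabet is English a-z plus the nine Polish diacritic letters.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz" "ąćęłńóśźż")


def count_foreign_symbols(line):
    """Count letters outside ALLOWED, ignoring words that start with a capital letter."""
    count = 0
    for token in line.split():
        if token[:1].isupper():
            continue  # most likely a name, so its letters are not counted
        count += sum(1 for ch in token.lower() if ch.isalpha() and ch not in ALLOWED)
    return count


def keep_line(line, max_foreign=1):
    """Keep the line unless it contains at least max_foreign + 1 foreign letters."""
    return count_foreign_symbols(line) <= max_foreign


print(keep_line("Müller lubi kawę"))  # True: the only foreign letter is inside a name
print(keep_line("größe über"))        # False: three foreign letters
```

One limitation of this sketch: the first word of a sentence is also capitalised, so it is skipped in the same way as a name.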
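
For the duplicates described in item 9, a minimal sketch could collapse repeats that directly follow each other, first at sentence level and then between comma-separated parts. The function names are hypothetical, Python is assumed, and the sketch only handles adjacent repeats like the two examples given; duplicates separated by other text would need a different approach.

```python
import re


def drop_adjacent_repeats(items):
    """Remove items that repeat the item directly before them (case-insensitive)."""
    result = []
    for item in items:
        if not result or item.strip().lower() != result[-1].strip().lower():
            result.append(item)
    return result


def dedupe_segment(segment):
    """Collapse repeated sentences, then repeated comma-separated parts, in one segment."""
    sentences = re.split(r"(?<=[.!?])\s+", segment.strip())
    cleaned = []
    for sentence in drop_adjacent_repeats(sentences):
        # Set the closing punctuation aside so "Part B" and "Part B." compare equal.
        body = sentence.rstrip(".!?")
        ending = sentence[len(body):]
        parts = [part.strip() for part in body.split(",")]
        cleaned.append(", ".join(drop_adjacent_repeats(parts)) + ending)
    return " ".join(cleaned)


print(dedupe_segment("Sentence A. Sentence A."))         # Sentence A.
print(dedupe_segment("Part A, Part B, Part B, Part C"))  # Part A, Part B, Part C
```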

Software Architecture

Project ID: 5309549

About the project

4 proposals

Remote project

Active: 10 years ago

Awarded to:

@helmot

Hi. As discussed by email.

$200 USD in 10 days

4.6

(48 reviews)

6.2

4 freelancers are bidding an average of $227 USD for this project

@greenvector

Hi, I am an electrical engineer and I have sound knowledge of C, Java and Perl. I can help you with this text processing tool. Can you please tell me which language your tool is written in? Looking forward to your reply...

$250 USD in 20 days

5.0

(7 reviews)

3.3

@faheems189

I have read the whole description and am highly interested in starting this project. I have a team that delivers good-quality work. Thanks.

$200 USD in 3 days

5.0

(5 reviews)

2.3

@GiveUsYourTask

Dear Customer, I can start now. Kindly note that I have a professional team to do your work, so if you are interested, kindly send me a PM so that I can start. Best regards.

$257 USD in 10 days